Problem description
I am new to Pandas. I want a simple and generic way to find which columns in my DataFrame are categorical when I don't manually specify each column type, unlike in this SO question. The df is created with:
import pandas as pd
df = pd.read_csv("test.csv", header=None)
For example:
          0          1          2         3       4
0  1.539240   0.423437  -0.687014   Chicago  Safari
1  0.815336   0.913623   1.800160    Boston  Safari
2  0.821214  -0.824839   0.483724  New York  Safari
UPDATE (2018/02/04): The question assumes numerical columns are NOT categorical; @Zero's accepted answer solves it under that assumption.
BE CAREFUL: as @Sagarkar's comment points out, that assumption is not always true. The difficulty is that data types and categorical/ordinal/nominal types are orthogonal concepts, so mapping between them isn't straightforward. @Jeff's answer below specifies the precise manner to achieve the manual mapping.
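To illustrate why dtype alone cannot decide this, here is a minimal sketch of one way to declare categorical columns by hand. It is not necessarily the approach in @Jeff's answer; the column names and values (`price`, `rating`, `city`) are hypothetical and chosen only for illustration.

```python
import pandas as pd

# Hypothetical data: "rating" is stored as integers but is really an ordinal category.
df = pd.DataFrame({
    "price": [1.5, 0.8, 0.82, 2.1],
    "rating": [1, 3, 2, 3],
    "city": ["Chicago", "Boston", "New York", "Boston"],
})

# Declare the categorical columns explicitly instead of inferring them from dtype.
df["rating"] = pd.Categorical(df["rating"], categories=[1, 2, 3], ordered=True)
df["city"] = df["city"].astype("category")

# Once declared, the categorical columns can be selected by their dtype.
categorical_cols = df.select_dtypes(include="category").columns.tolist()
print(categorical_cols)  # ['rating', 'city']
```

Here `rating` would have been classified as numeric by any dtype-based check, which is exactly the case the warning above is about.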
Recommended answer
You could use df._get_numeric_data() to get the numeric columns, and then treat the remaining columns as categorical:
In [66]: cols = df.columns
In [67]: num_cols = df._get_numeric_data().columns
In [68]: num_cols
Out[68]: Index([u'0', u'1', u'2'], dtype='object')
In [69]: list(set(cols) - set(num_cols))
Out[69]: ['3', '4']
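Note that `_get_numeric_data()` has a leading underscore, which marks it as a private pandas method. A sketch of the same idea using the public `DataFrame.select_dtypes` is shown below. The inline CSV text is an assumed stand-in for `test.csv`, built from the sample rows in the question, and it keeps the question's assumption that every non-numeric column is categorical.

```python
import io

import pandas as pd

# Assumed stand-in for test.csv, based on the sample rows shown in the question.
csv_text = """1.539240,0.423437,-0.687014,Chicago,Safari
0.815336,0.913623,1.800160,Boston,Safari
0.821214,-0.824839,0.483724,New York,Safari
"""
df = pd.read_csv(io.StringIO(csv_text), header=None)

# Numeric columns, selected through the public API.
num_cols = df.select_dtypes(include="number").columns

# Everything else is treated as categorical (the question's assumption).
cat_cols = df.select_dtypes(exclude="number").columns

print(list(num_cols))  # [0, 1, 2]
print(list(cat_cols))  # [3, 4]
```

Unlike the `set` difference above, `select_dtypes(exclude="number")` keeps the columns in their original order.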