Check which columns in a DataFrame are categorical

Problem description

I am new to Pandas... I want a simple and generic way to find which columns in my DataFrame are categorical without manually specifying each column type, unlike in this SO question. The df is created with:

import pandas as pd
df = pd.read_csv("test.csv", header=None)

For example:

           0         1         2         3        4
0   1.539240  0.423437 -0.687014   Chicago   Safari
1   0.815336  0.913623  1.800160    Boston   Safari
2   0.821214 -0.824839  0.483724  New York   Safari

UPDATE (2018/02/04): The question assumes numerical columns are NOT categorical; @Zero's accepted answer solves this.

BE CAREFUL: as @Sagarkar's comment points out, that assumption is not always true. The difficulty is that data types and categorical/ordinal/nominal types are orthogonal concepts, so mapping between them isn't straightforward. @Jeff's answer below specifies the precise manner to achieve the manual mapping.
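To illustrate why dtype alone is not enough, here is a minimal sketch of a manual mapping. It is not @Jeff's answer (which is not reproduced in this thread); the DataFrame, its column names, and the choice of which columns to mark are all hypothetical. It assumes you know that an integer column such as `rating` is really ordinal, declare that explicitly with the `category` dtype, and then query that dtype.

```python
import pandas as pd

# Hypothetical data: "rating" is stored as integers but is really ordinal.
df = pd.DataFrame({
    "price": [9.5, 12.0, 7.25],
    "rating": [1, 3, 2],
    "city": ["Chicago", "Boston", "New York"],
})

# Manual mapping: declare which columns are categorical, whatever their dtype.
df["rating"] = pd.Categorical(df["rating"], categories=[1, 2, 3], ordered=True)
df["city"] = df["city"].astype("category")

# Once declared, the category dtype can be selected directly.
cat_cols = df.select_dtypes(include="category").columns.tolist()
print(cat_cols)
```

A purely numeric check would have classified `rating` as non-categorical here, which is the pitfall the comment describes.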

Recommended answer

You could use df._get_numeric_data() to get the numeric columns, then treat the remaining columns as categorical:

In [66]: cols = df.columns

In [67]: num_cols = df._get_numeric_data().columns

In [68]: num_cols
Out[68]: Index([u'0', u'1', u'2'], dtype='object')

In [69]: list(set(cols) - set(num_cols))
Out[69]: ['3', '4']
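Since `_get_numeric_data()` is a private method (note the leading underscore), the same idea can also be sketched with the public `DataFrame.select_dtypes`. This is an alternative to the answer above, not part of it. The inline CSV text is a stand-in for `test.csv`, built from the three rows shown in the question, and like the answer it assumes every non-numeric column is categorical.

```python
import io

import pandas as pd

# Inline stand-in for test.csv (the three example rows from the question).
csv_text = """1.539240,0.423437,-0.687014,Chicago,Safari
0.815336,0.913623,1.800160,Boston,Safari
0.821214,-0.824839,0.483724,New York,Safari
"""
df = pd.read_csv(io.StringIO(csv_text), header=None)

# Public API: select the numeric columns, then take everything else.
num_cols = df.select_dtypes(include="number").columns
cat_cols = df.columns.difference(num_cols)  # Index difference keeps a sorted order

# Equivalent one-step form: exclude numeric dtypes directly.
non_numeric = df.select_dtypes(exclude="number").columns.tolist()

print(list(num_cols))
print(list(cat_cols))
```

Using `Index.difference` rather than a Python `set` difference avoids the arbitrary ordering that sets can produce.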
