Strings in a DataFrame, but dtype is object

Question

Why does Pandas tell me that I have objects, although every item in the selected column is a string, even after explicit conversion?

This is my DataFrame:

<class 'pandas.core.frame.DataFrame'>
Int64Index: 56992 entries, 0 to 56991
Data columns (total 7 columns):
id            56992  non-null values
attr1         56992  non-null values
attr2         56992  non-null values
attr3         56992  non-null values
attr4         56992  non-null values
attr5         56992  non-null values
attr6         56992  non-null values
dtypes: int64(2), object(5)

Five of them are dtype object. I explicitly convert those objects to strings:

# Convert every object-dtype column to str, column by column
for c in df.columns:
    if df[c].dtype == object:
        print("convert", df[c].name, "to string")
        df[c] = df[c].astype(str)

Then df["attr2"] still has dtype object, although type(df["attr2"].iloc[0]) reveals str, which is correct.

Pandas distinguishes between int64, float64 and object. What is the logic behind there being no dtype str? Why is a str covered by object?

Recommended answer

The dtype object comes from NumPy; it describes the type of the elements in an ndarray. Every element in an ndarray must have the same size in bytes. For int64 and float64, that size is 8 bytes. Strings, however, do not have a fixed length. So instead of storing the bytes of the strings in the ndarray directly, Pandas uses an object ndarray, which stores pointers to objects. That is why the dtype of this kind of ndarray is object.
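A minimal sketch of the point above, assuming NumPy and pandas are installed. The sample values and variable names are made up for illustration, and the Series is created with dtype=object explicitly, since what pandas infers for strings by default can differ between versions:

```python
import numpy as np
import pandas as pd

# Fixed-size dtypes: every element occupies the same number of bytes.
ints = np.array([1, 2, 3, 4], dtype=np.int64)
floats = np.array([1.0, 2.5], dtype=np.float64)
print(ints.itemsize, floats.itemsize)  # 8 8

# Strings of different lengths do not fit a fixed 8-byte slot,
# so the column holds references to Python str objects instead.
s = pd.Series(["a", "bb", "a much longer string"], dtype=object)
print(s.dtype)                # object
print(type(s.iloc[0]))        # <class 'str'>
print(s.to_numpy().dtype)     # object: the underlying ndarray is an object array

# Every element really is a str, even though the dtype says object.
all_str = all(isinstance(v, str) for v in s)
print(all_str)                # True
```

So dtype object describes how the array stores its elements (as references), not what the referenced values are.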

Here is an example:

  • The int64 array contains 4 int64 values.
  • The object array contains 4 pointers to 3 string objects.
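The two arrays described above can be reproduced roughly as follows. This is only a sketch using NumPy directly; the string values are hypothetical, chosen so that one string appears twice and the four slots refer to three distinct objects:

```python
import numpy as np

# int64 array: four values stored inline, 8 bytes each.
ints = np.array([1, 2, 3, 4], dtype=np.int64)
print(ints.nbytes)  # 32

# Object array: four slots, each holding a reference to a Python object.
a, b, c = "apple", "banana", "cherry"
objs = np.array([a, b, c, a], dtype=object)
print(objs.dtype)   # object
print(len(objs))    # 4

# The first and last slots point at the same str object,
# so only three distinct objects are referenced.
distinct = len({id(v) for v in objs})
print(distinct)            # 3
print(objs[0] is objs[3])  # True
```

The array itself only stores the references; the string data lives elsewhere in memory, which is why strings of any length fit.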

