Question
I'm working with both R and Python and I want to write one of my pandas DataFrames as a feather file so I can work with it more easily in R. However, when I try to write it as a feather file, I get the following error:
ArrowInvalid: trying to convert NumPy type float64 but got float32
I double-checked my column types and they are already float64:
In[1]
df.dtypes
Out[1]
id object
cluster int64
vector_x float64
vector_y float64
I get the same error whether I use feather.write_dataframe(df, "path/df.feather") or df.to_feather("path/df.feather").
I found these two issue reports but couldn't tell whether they were related: https://issues.apache.org/jira/browse/ARROW-1345 and https://github.com/apache/arrow/issues/1430
In the end, I can just save it as a csv and change the columns in R (or just do the whole analysis in Python), but I was hoping to use this.
I'm still having the same issue despite the great advice below, so I'm updating with what I've tried.
df[['vector_x', 'vector_y', 'cluster']] = df[['vector_x', 'vector_y', 'cluster']].astype(float)
df[['doc_id', 'text']] = df[['doc_id', 'text']].astype(str)
df[['doc_vector', 'doc_vectors_2d']] = df[['doc_vector', 'doc_vectors_2d']].astype(list)
df.dtypes
Out[1]:
doc_id object
text object
doc_vector object
cluster float64
doc_vectors_2d object
vector_x float64
vector_y float64
dtype: object
After much searching, it appears that the issue is that my cluster column is a list type made up of int64 integers. So I guess the real question is, does feather format support lists?
Just to tie this in a bow, feather does not support nested data types like lists, at least not yet.
Recommended answer
After much research, the simple answer is that feather does not support list (or other nested data type) columns.
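A column of Python lists shows up as plain "object" in df.dtypes, the same as a string column, so the dtype listing alone won't reveal it. Below is a minimal sketch of one way to find such columns and flatten them into scalar columns before writing. It is not part of the original answer: the helper names find_list_columns and expand_list_column and the sample data are hypothetical, and it assumes every list in a column has the same length. It also assumes the limitation described above applies to your installed version; newer Arrow releases may behave differently.

```python
import pandas as pd


def find_list_columns(df):
    """Return names of columns that hold at least one list or tuple value."""
    return [
        col for col in df.columns
        if df[col].map(lambda v: isinstance(v, (list, tuple))).any()
    ]


def expand_list_column(df, col):
    """Replace a column of equal-length lists with one scalar column per element."""
    expanded = pd.DataFrame(df[col].tolist(), index=df.index)
    expanded.columns = [f"{col}_{i}" for i in expanded.columns]
    return pd.concat([df.drop(columns=[col]), expanded], axis=1)


# Hypothetical sample data, loosely modelled on the columns in the question.
df = pd.DataFrame({
    "doc_id": ["a", "b"],
    "cluster": [0, 1],
    "doc_vectors_2d": [[0.1, 0.2], [0.3, 0.4]],
})

list_cols = find_list_columns(df)

# Flatten each nested column so that only scalar columns remain.
flat = df
for col in list_cols:
    flat = expand_list_column(flat, col)

print(list_cols)
print(flat.dtypes)
```

The flattened frame contains only scalar columns, so it is the kind of table the answer says feather can store. If the lists vary in length, dropping those columns or serializing each list to a string would be alternatives to consider.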