This article covers how to initialize a pandas DataFrame with predefined dtypes.
Problem description
The pd.DataFrame docstring specifies a scalar dtype argument for the whole DataFrame:
dtype : dtype, default None
    Data type to force, otherwise infer
It does seem to be intended as a scalar, because the following attempts raise an error:
# Both attempts pass a list of per-column dtypes, which the dtype argument rejects
dfbinseq = pd.DataFrame([],
                        columns=["chr", "centre", "seq_binary"],
                        dtype=["O", pd.np.int64, "O"])

dfbinseq = pd.DataFrame([],
                        columns=["chr", "centre", "seq_binary"],
                        dtype=[pd.np.object, pd.np.int64, pd.np.object])
The only workaround I found for creating an empty DataFrame (which I need to put in an HDF5 store for later appends) was
dfbinseq.centre.dtype = np.int64
Is there a way to set the dtypes of all columns at once?
Recommended answer
You can set the dtype on each column's Series:
import pandas as pd

df = pd.DataFrame({'A': pd.Series([], dtype='str'),
                   'B': pd.Series([], dtype='int'),
                   'C': pd.Series([], dtype='float')})

print(df)
Empty DataFrame
Columns: [A, B, C]
Index: []

print(df.dtypes)
A     object
B      int32
C    float64
dtype: object

(The int32 shown for 'int' is platform-dependent; on most Linux and macOS builds it resolves to int64.)
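The same per-column Series idea can be wrapped in a small helper so that all dtypes are declared in one place. This is only a sketch: the function name empty_frame and its schema parameter are hypothetical and not part of pandas, and the column names are borrowed from the question. An explicit "int64" is used to avoid the platform-dependent width of 'int'.

```python
import pandas as pd


def empty_frame(schema):
    """Return an empty DataFrame with one typed column per schema entry.

    `schema` is a hypothetical mapping of column name -> dtype.
    """
    # Build one empty, explicitly typed Series per column.
    return pd.DataFrame(
        {name: pd.Series(dtype=dtype) for name, dtype in schema.items()}
    )


dfbinseq = empty_frame({"chr": object, "centre": "int64", "seq_binary": object})
print(dfbinseq.dtypes)
```

Because each column is created with its own dtype, nothing has to be inferred from the (absent) data.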
The same approach with data:
df = pd.DataFrame({'A': pd.Series([1, 2, 3], dtype='str'),
                   'B': pd.Series([4, 5, 6], dtype='int'),
                   'C': pd.Series([7, 8, 9], dtype='float')})

print(df)
   A  B    C
0  1  4  7.0
1  2  5  8.0
2  3  6  9.0

print(df.dtypes)
A     object
B      int32
C    float64
dtype: object
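As an alternative that is not part of the original answer, DataFrame.astype accepts a mapping of column name to dtype, which may be a convenient way to set several dtypes in one call, even on an empty frame. The sketch below assumes the column names from the question; the variable names columns and dtypes are illustrative.

```python
import pandas as pd

columns = ["chr", "centre", "seq_binary"]
# Illustrative mapping of column name -> target dtype.
dtypes = {"chr": object, "centre": "int64", "seq_binary": object}

# Create the empty frame first, then cast all columns in one astype call.
dfbinseq = pd.DataFrame(columns=columns).astype(dtypes)
print(dfbinseq.dtypes)
```

This keeps the column list and the dtype mapping as plain Python objects, so they can be reused when building later frames to append.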