Appending a row to a Pandas DataFrame adds a column named 0

Question

I'm creating a Pandas DataFrame to store data. Unfortunately, I can't know the number of rows of data that I'll have ahead of time. So my approach has been the following.

First, I declare an empty DataFrame.

from pandas import DataFrame
df = DataFrame(columns=['col1', 'col2'])

Then, I append a row of missing values.

df = df.append([None] * 2, ignore_index=True)

Finally, I can insert values into this DataFrame one cell at a time. (Why I have to do this one cell at a time is a long story.)

df['col1'][0] = 3.28

This approach works perfectly fine, with the exception that the append statement inserts an additional column to my DataFrame. At the end of the process the output I see when I type df looks like this (with 100 rows of data).

<class 'pandas.core.frame.DataFrame'>
Data columns (total 2 columns):
0            0  non-null values
col1         100  non-null values
col2         100  non-null values

df.head() looks like this.

      0   col1   col2
0  None   3.28      1
1  None      1      0
2  None      1      0
3  None      1      0
4  None      1      1

Any thoughts on what is causing this 0 column to appear in my DataFrame?

Recommended answer

The append call is adding a column to your DataFrame, not a row. A bare list such as [None] * 2 carries no column names, so pandas treats it as a single unnamed column with two None/NaN elements and gives that column the default name 0.

For the append to add a row, the column names of the incoming object must match the column names of the existing DataFrame. Any incoming name that does not match is created as a new column (by default).
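The naming behavior described above can be checked directly. The following is only a minimal sketch: it uses pd.concat to show the column alignment, and the one-row starting frame and its values are assumptions made up for illustration, not the asker's data.

```python
import pandas as pd

# A bare list is read as one unnamed column, so pandas labels it 0
# and creates one row per list element.
incoming = pd.DataFrame([None] * 2)
print(list(incoming.columns))
print(incoming.shape)

# Hypothetical existing frame using the column names from the question.
df = pd.DataFrame({'col1': [3.28], 'col2': [1]})

# Combining frames aligns on column labels; a label found in only one
# frame becomes a new column in the result.
combined = pd.concat([df, incoming], ignore_index=True)
print(combined)
```

The combined frame keeps col1 and col2 and gains a column labeled 0, which is the same symptom the question shows.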

# Requires pandas < 2.0: DataFrame.append was deprecated in 1.4 and removed in 2.0.
import numpy as np
from pandas import DataFrame, Series

# You need to explicitly name the columns of the incoming row in the append call.
df = DataFrame(columns=['col1', 'col2'])
print(df.append(Series([None] * 2, index=['col1', 'col2']), ignore_index=True))


# As an aside

df = DataFrame(np.random.randn(8, 4), columns=['A', 'B', 'C', 'D'])
dfRowImproper = [1, 2, 3, 4]
# dfRowProper = DataFrame(np.arange(4) + 1, columns=['A', 'B', 'C', 'D'])  # will not work: a 1-D array is read as one column of 4 rows, not one row of 4 columns
dfRowProper = DataFrame([np.arange(4) + 1], columns=['A', 'B', 'C', 'D'])  # works: a list holding one array is a single row


print(df.append(dfRowImproper))  # adds a column named 0 plus 4 extra rows that have values only in that column

print(df.append(dfRowProper))  # works as you would like because the column names are consistent

print(df.append(DataFrame(np.random.randn(1, 4))))  # adds four extra columns (0 to 3) and one extra row


print(df.append(Series(dfRowImproper, index=['A', 'B', 'C', 'D']), ignore_index=True))  # works as you want
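DataFrame.append was deprecated in pandas 1.4 and removed in pandas 2.0, so the calls above only run on older versions. On current pandas, one possible equivalent is sketched below. This is not part of the original answer. It assumes the default integer row labels, and the cell values are placeholders taken from the question's sample output.

```python
import pandas as pd

df = pd.DataFrame(columns=['col1', 'col2'])

# Add an empty row by label. The list is matched to the existing columns
# by position, so no extra column is created.
df.loc[len(df)] = [None, None]

# Fill in one cell at a time with label-based access instead of
# chained indexing such as df['col1'][0] = ...
df.at[0, 'col1'] = 3.28
df.at[0, 'col2'] = 1

# Alternative: wrap the new row in a one-row DataFrame whose column names
# match, then concatenate.
new_row = pd.DataFrame([{'col1': 1, 'col2': 0}])
df = pd.concat([df, new_row], ignore_index=True)
print(df)
```

Both routes keep the frame at exactly two columns because the new row is tied to col1 and col2 either by position or by name.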

08-14 14:02