Adding column headers to a new pandas dataframe

Problem description

I am creating a new pandas dataframe from a previous dataframe using the .groupby and .size methods.

[in] results = df.groupby(["X", "Y", "Z", "F"]).size()

[out]
    9   27/02/2016  1   N   326
    9   27/02/2016  1   S   332
    9   27/02/2016  2   N   280
    9   27/02/2016  2   S   353
    9   27/02/2016  3   N   177

This behaves as expected; however, the result is a dataframe with no column headers.

This SO question states that the following adds column names to the generated dataframe:

[in] results.columns = ["X","Y","Z","F","Count"]

However, this does not seem to have any effect at all.

[out]
        9   27/02/2016  1   N   326
        9   27/02/2016  1   S   332
        9   27/02/2016  2   N   280
        9   27/02/2016  2   S   353
        9   27/02/2016  3   N   177

Recommended answer

What you're seeing is your grouped columns as the index: size() returns a Series whose index holds X, Y, Z and F, so there are no columns to rename. If you call reset_index, the index levels become columns again and the column names are restored.

So

results = df.groupby(["X", "Y", "Z", "F"]).size()
results.reset_index()

should work:

In [11]:
df.groupby(["X","Y","Z","F"]).size()

Out[11]:
X  Y           Z  F
9  27/02/2016  1  N    1
                  S    1
               2  N    1
                  S    1
               3  N    1
dtype: int64

In [12]:
df.groupby(["X","Y","Z","F"]).size().reset_index()

Out[12]:
   X           Y  Z  F  0
0  9  27/02/2016  1  N  1
1  9  27/02/2016  1  S  1
2  9  27/02/2016  2  N  1
3  9  27/02/2016  2  S  1
4  9  27/02/2016  3  N  1
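
In Out[12] the counts end up in a column labelled 0. A minimal sketch of one way to give that column a header, assuming a small made-up frame shaped like the one in the question (the sample values and the name "Count" are illustrative, not taken from the original data): Series.reset_index accepts a name argument that labels the values column.

```python
import pandas as pd

# Hypothetical sample data shaped like the frame in the question.
df = pd.DataFrame({
    "X": [9, 9, 9, 9, 9, 9],
    "Y": ["27/02/2016"] * 6,
    "Z": [1, 1, 1, 2, 2, 3],
    "F": ["N", "N", "S", "N", "S", "N"],
})

sizes = df.groupby(["X", "Y", "Z", "F"]).size()

# size() returns a Series; the group keys live in its MultiIndex.
print(type(sizes).__name__)      # Series
print(list(sizes.index.names))   # ['X', 'Y', 'Z', 'F']

# reset_index turns the index levels into columns; `name` labels the values.
results = sizes.reset_index(name="Count")
print(list(results.columns))     # ['X', 'Y', 'Z', 'F', 'Count']
```

Note that reset_index returns a new object rather than modifying the Series in place, so the result has to be assigned to a name to be kept.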

Additionally, you can achieve what you want by using count:

In [13]:
df.groupby(["X","Y","Z","F"]).count().reset_index()

Out[13]:
   X           Y  Z  F  Count
0  9  27/02/2016  1  N      1
1  9  27/02/2016  1  S      1
2  9  27/02/2016  2  N      1
3  9  27/02/2016  2  S      1
4  9  27/02/2016  3  N      1

You could also pass the parameter as_index=False here:

In [15]:
df.groupby(["X","Y","Z","F"], as_index=False).count()

Out[15]:
   X           Y  Z  F  Count
0  9  27/02/2016  1  N      1
1  9  27/02/2016  1  S      1
2  9  27/02/2016  2  N      1
3  9  27/02/2016  2  S      1
4  9  27/02/2016  3  N      1
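
One caveat worth sketching: count and size are not interchangeable. count reports the number of non-null entries in each non-key column and keeps those columns' names, so the "Count" header in the outputs above presumably comes from a column that already existed in the answer's df (an assumption; the answer does not show that frame). size counts rows per group regardless of missing values. The frame and the "Value" column below are hypothetical.

```python
import pandas as pd

# Hypothetical data: "Value" is an illustrative extra column with one missing entry.
df = pd.DataFrame({
    "X": [9, 9, 9],
    "F": ["N", "N", "S"],
    "Value": [1.0, None, 3.0],
})

# count() tallies non-null entries in each non-key column.
counts = df.groupby(["X", "F"], as_index=False).count()
print(counts)

# size() counts rows per group, including rows with missing values.
sizes = df.groupby(["X", "F"]).size().reset_index(name="Count")
print(sizes)
```

Here the (9, N) group has two rows but only one non-null Value, so the two approaches give different numbers for it.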

This is normally fine, but some aggregation functions will fail if you apply them to columns whose dtypes cannot be aggregated, for instance if you have str dtypes and decide to call mean.
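
A rough sketch of two ways to sidestep that problem, assuming a made-up frame with one string column and one numeric column (the column names key, label and value are hypothetical): either select the numeric column before aggregating, or pass numeric_only=True so that mean skips columns it cannot average.

```python
import pandas as pd

# Hypothetical frame mixing a string column with a numeric one.
df = pd.DataFrame({
    "key": ["a", "a", "b"],
    "label": ["x", "y", "z"],   # str values: no meaningful mean
    "value": [1.0, 3.0, 5.0],
})

# Option 1: select the numeric column before aggregating.
by_column = df.groupby("key")["value"].mean().reset_index()

# Option 2: ask mean() to skip non-numeric columns.
numeric_only = df.groupby("key").mean(numeric_only=True).reset_index()

print(by_column)
print(numeric_only)
```

Both variants produce a frame with the columns key and value; the string column label is left out of the aggregation.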
