How to apply different aggregation functions to the same column using pandas Groupby

Question

It is clear what this does:

 data.groupby(['A','B']).mean()

We get a result with a MultiIndex on the levels 'A' and 'B', and each remaining column holds the mean of each group.

How can I get count() and std() at the same time?

The result should look like this as a DataFrame:

A   B    mean   count   std

Recommended answer

The following should work:

data.groupby(['A','B']).agg([pd.Series.mean, pd.Series.std, pd.Series.count])

Basically, calling agg and passing a list of functions generates multiple columns, one for each function applied.

Example:

In [12]:

import numpy as np
import pandas as pd

df = pd.DataFrame({'a':np.random.randn(5), 'b':[0,0,1,1,2]})
df.groupby(['b']).agg([pd.Series.mean, pd.Series.std, pd.Series.count])
Out[12]:
          a
       mean       std count
b
0 -0.769198  0.158049     2
1  0.247708  0.743606     2
2 -0.312705       NaN     1
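
The NaN in the std column for group 2 is expected: that group has a single row, and the groupby std uses the sample standard deviation (ddof=1) by default, which is undefined for one observation. A minimal sketch of this, using made-up fixed values instead of random data (the values and variable names are assumptions for illustration only):

```python
import pandas as pd

# Fixed illustrative values so the result is reproducible.
df = pd.DataFrame({'a': [1.0, 3.0, 2.0, 6.0, 5.0], 'b': [0, 0, 1, 1, 2]})
grouped = df.groupby('b')['a']

# Default ddof=1 (sample std): NaN for a group with one row.
sample_std = grouped.std()

# ddof=0 (population std): 0.0 for a group with one row.
population_std = grouped.std(ddof=0)

print(sample_std)
print(population_std)
```

Whether ddof=0 is appropriate depends on what the numbers represent; the default is usually what you want for sampled data.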

You can also pass the method names as strings. The common ones work; some of the more obscure ones don't (I can't remember which), but in this case they work fine. Thanks to @ajcr for the suggestion:

In [16]:
df = pd.DataFrame({'a':np.random.randn(5), 'b':[0,0,1,1,2]})
df.groupby(['b']).agg(['mean', 'std', 'count'])

Out[16]:
          a
       mean       std count
b
0 -1.037301  0.790498     2
1 -0.495549  0.748858     2
2 -0.644818       NaN     1
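
To get the flat "A B mean count std" layout from the question, one possible approach is to select a single column before calling agg and then call reset_index. The sketch below assumes a hypothetical frame with a value column named 'value'; the data is made up for illustration. It also shows named aggregation (available in pandas 0.25 and later) as an alternative when you want to choose the output column names yourself.

```python
import pandas as pd

# Hypothetical data: 'A' and 'B' are grouping keys, 'value' is the column to aggregate.
data = pd.DataFrame({
    'A': ['x', 'x', 'x', 'y', 'y'],
    'B': [1, 1, 2, 1, 1],
    'value': [1.0, 3.0, 5.0, 2.0, 6.0],
})

# Selecting one column first avoids the two-level column header;
# reset_index turns the group keys back into ordinary columns.
flat = data.groupby(['A', 'B'])['value'].agg(['mean', 'count', 'std']).reset_index()
print(flat)

# Named aggregation: keyword = (source column, function name).
named = data.groupby(['A', 'B']).agg(
    value_mean=('value', 'mean'),
    value_count=('value', 'count'),
    value_std=('value', 'std'),
).reset_index()
print(named)
```

Both results have one row per (A, B) combination and plain, single-level column names.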
