Problem description
This is best illustrated by an example:
I would like to aggregate a DataFrame by col1 and col2, summing the results on col3 and col4 and averaging the results on col5.
If I just wanted to sum col3 through col5, I'd use df.groupby(['col1','col2']).sum()
Recommended answer
You can use the GroupBy.agg() (or GroupBy.aggregate()) method for this.
The aggregate() function can accept a dictionary as its argument, in which case it treats the keys as column names and each value as the function to use for aggregating that column. The documentation describes this dictionary form.
Example:
import numpy as np
result = df.groupby(['col1','col2']).agg({'col3':'sum','col4':'sum','col5':np.average})
Demo:
In [50]: df = pd.DataFrame([[1,2,3,4,5],[1,2,6,7,8],[2,3,4,5,6]],columns=list('ABCDE'))
In [51]: df
Out[51]:
   A  B  C  D  E
0  1  2  3  4  5
1  1  2  6  7  8
2  2  3  4  5  6
In [52]: df.groupby(['A','B']).aggregate({'C':np.sum,'D':np.sum,'E':np.average})
Out[52]:
     C    E   D
A B
1 2  9  6.5  11
2 3  4  6.0   5
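
For readers on a recent pandas release, the same aggregation can also be written with string function names in place of the NumPy callables. Newer versions may emit a warning when np.sum is passed to agg(). The following is a minimal, self-contained sketch that reuses the toy DataFrame from the demo above. The variable names result and flat are illustrative only, and 'mean' stands in for np.average (the two agree for an unweighted average).

```python
import pandas as pd

# Same toy data as in the demo above.
df = pd.DataFrame(
    [[1, 2, 3, 4, 5], [1, 2, 6, 7, 8], [2, 3, 4, 5, 6]],
    columns=list("ABCDE"),
)

# Map each column to the name of the aggregation to apply to it:
# sum C and D, average E, within each (A, B) group.
result = df.groupby(["A", "B"]).agg({"C": "sum", "D": "sum", "E": "mean"})
print(result)

# Optional: turn the (A, B) MultiIndex back into ordinary columns.
flat = result.reset_index()
print(flat)
```

On current Python versions the result columns should follow the order of the dictionary keys, and reset_index() is handy when the grouped keys are needed as regular columns for further processing.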