Confused about what a groupby operation over multiple columns does in Pandas (Python)
Problem description
Here are my dataframe and function:
import pandas as pd

df = pd.DataFrame({
    'G': 'x x y y'.split(),
    'C': [1, 2, 1, 2],
    'D': [2, 2, 1, 1]})

def CD(df):
    # Element-wise product of columns C and D
    return df['C'] * df['D']
Here is what my dataframe looks like:
   G  C  D
0  x  1  2
1  x  2  2
2  y  1  1
3  y  2  1
When I run
df.groupby('G').apply(CD)
I expected that it would sum over x and y to get
   G  C  D
0  x  3  4
1  y  3  2
Then, I expected it to multiply C and D to get
x 12
y 6
However, I got
G
x  0    2
   1    4
y  2    1
   3    2
That new column of [2, 4, 1, 2] looks no different from what I would have obtained if I had simply run
df['C'] * df['D']
Clearly, I am confused about what groupby does. What is df.groupby('G').apply(CD) doing in my example?
Recommended answer
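groupby does not sum anything by itself. It only splits the dataframe into groups, so apply(CD) runs CD on each group separately and returns the row-wise products. Sum each group first, then pass the result to your function.
>>> CD(df.groupby('G')[['C', 'D']].sum())
G
x    12
y     6
dtype: int64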
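To make the difference concrete, here is a minimal sketch that reuses the dataframe and CD function from the question. The names per_group, totals, and result are introduced only for illustration. It first loops over the groups by hand to show roughly what apply(CD) computes for each group, then aggregates before multiplying, which is what the question expected.

```python
import pandas as pd

df = pd.DataFrame({
    'G': 'x x y y'.split(),
    'C': [1, 2, 1, 2],
    'D': [2, 2, 1, 1]})

def CD(frame):
    # Element-wise product of columns C and D
    return frame['C'] * frame['D']

# Roughly what groupby('G').apply(CD) does: call CD once per group.
# No summing happens, so each group keeps one product per row.
per_group = {name: CD(group).tolist() for name, group in df.groupby('G')}

# Aggregate first (one row per group), then multiply the totals.
totals = df.groupby('G')[['C', 'D']].sum()
result = CD(totals)
```

Here per_group holds the row-wise products for each group, which is why the output in the question matches df['C'] * df['D'], while result holds one value per group computed from the summed columns.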