pandas: group by two columns to get the sum of another column

Question

I looked through most of the previously asked questions but was not able to find an answer to mine:

I have the following DataFrame:

           id   year month score num_attempts
0      483625  2010    01   50      1
1      967799  2009    03   50      1
2      213473  2005    09  100      1
3      498110  2010    12   60      1
5      187243  2010    01  100      1
6      508311  2005    10   15      1
7      486688  2005    10   50      1
8      212550  2005    10  500      1
10     136701  2005    09   25      1
11     471651  2010    01   50      1

I want to get the following DataFrame:

year month sum_score sum_num_attempts
2009    03   50           1
2005    09  125           2
2010    12   60           1
2010    01  200           3
2005    10  565           3

This is what I tried:

sum_df = df.groupby(by=['year','month'])['score'].sum()

But this doesn't look efficient or correct. If more than one column needs to be aggregated, this seems like a very expensive call. For example, I have another column, num_attempts, and want to sum it by year and month in the same way as score.

Recommended answer

This should be an efficient way:

sum_df = df.groupby(['year','month']).agg({'score': 'sum', 'num_attempts': 'sum'})
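
The call above leaves year and month in the index and keeps the original column names. As a minimal sketch, assuming pandas 0.25 or later (where named aggregation is available), the same grouping can also produce the sum_score and sum_num_attempts columns asked for in the question. The DataFrame here is rebuilt from the sample rows in the question; storing month as a string is an assumption about the original dtypes.

```python
import pandas as pd

# Sample rows copied from the question. Storing month as a string
# (to keep the leading zero) is an assumption, not stated in the question.
df = pd.DataFrame({
    "id": [483625, 967799, 213473, 498110, 187243,
           508311, 486688, 212550, 136701, 471651],
    "year": [2010, 2009, 2005, 2010, 2010, 2005, 2005, 2005, 2005, 2010],
    "month": ["01", "03", "09", "12", "01", "10", "10", "10", "09", "01"],
    "score": [50, 50, 100, 60, 100, 15, 50, 500, 25, 50],
    "num_attempts": [1] * 10,
})

# as_index=False keeps year and month as ordinary columns.
# Each keyword names an output column and maps it to (input column, function).
sum_df = df.groupby(["year", "month"], as_index=False).agg(
    sum_score=("score", "sum"),
    sum_num_attempts=("num_attempts", "sum"),
)

print(sum_df)
```

The result is sorted by the group keys; passing sort=False to groupby keeps the groups in order of first appearance instead.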
