Problem description
I am trying to calculate some statistics from a pandas dataframe. It looks something like this:
id value conditional
1 10 0
2 20 0
3 30 1
1 15 1
3 5 0
1 10 1
So, I need to calculate the cumulative sum of the column value for each id from top to bottom, but only over rows where conditional is 1.
So, this should give me something like:
id value conditional cumulative sum
1 10 0 0
2 20 0 0
3 30 1 30
1 15 1 15
3 5 0 30
1 10 1 25
So, the sum for id=1 only includes the 4th and 6th rows, where conditional=1; the value in the 1st row is not counted. How do I do this in pandas?
Recommended answer
You can create a Series that is the product of value and conditional, and take its cumulative sum within each id group. Because conditional is 0 or 1, the product is 0 for rows that should not count, so they add nothing to the running total:
df['cumsum'] = (df['value']*df['conditional']).groupby(df['id']).cumsum()
df
Out:
id value conditional cumsum
0 1 10 0 0
1 2 20 0 0
2 3 30 1 30
3 1 15 1 15
4 3 5 0 30
5 1 10 1 25
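For anyone who wants to reproduce this, here is a minimal self-contained sketch that rebuilds the example frame from the question and applies the same product-then-groupby approach. The names masked and alt are only illustrative. The second variant uses Series.where, which may be preferable if conditional is not strictly 0/1 (that case is an assumption, not something the question states).

```python
import pandas as pd

# Example data copied from the question.
df = pd.DataFrame({
    "id": [1, 2, 3, 1, 3, 1],
    "value": [10, 20, 30, 15, 5, 10],
    "conditional": [0, 0, 1, 1, 0, 1],
})

# Multiplying by the 0/1 flag zeroes out rows that should not count.
masked = df["value"] * df["conditional"]

# Running total of the masked values within each id, in row order.
df["cumsum"] = masked.groupby(df["id"]).cumsum()

# Variant (illustrative): keep value where the condition holds, else use 0.
alt = df["value"].where(df["conditional"] == 1, 0).groupby(df["id"]).cumsum()

print(df)
```

Both versions group by the id column passed as a Series, so the result stays aligned with the original row index and can be assigned straight back as a new column.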