This article explains how to compute a per-minute rolling sum with pandas in Python. It may be a useful reference for anyone facing the same problem.
Problem description
I have this df and I want to compute a sum with rolling (and not with groupby):
dateTime 1min hour minute X EXPECTED Rolling_X
2017-09-19 02:00:04 2017-09-19 02:00:00 2 0 93 93
2017-09-19 02:00:04 2017-09-19 02:00:00 2 0 1 94
2017-09-19 02:00:04 2017-09-19 02:00:00 2 0 1 95
2017-09-19 02:00:04 2017-09-19 02:00:00 2 0 0 95
2017-09-19 02:00:04 2017-09-19 02:00:00 2 0 0 95
2017-09-19 02:00:22 2017-09-19 02:00:00 2 0 1 96
2017-09-19 02:00:22 2017-09-19 02:00:00 2 0 1 97
2017-09-19 02:00:22 2017-09-19 02:00:00 2 0 2 99
2017-09-19 02:00:58 2017-09-19 02:00:00 2 0 1 100
2017-09-19 02:00:58 2017-09-19 02:00:00 2 0 0 100
2017-09-19 02:01:00 2017-09-19 02:01:00 2 1 7 7
2017-09-19 02:01:00 2017-09-19 02:01:00 2 1 0 7
2017-09-19 02:01:00 2017-09-19 02:01:00 2 1 0 7
2017-09-19 02:01:31 2017-09-19 02:01:00 2 1 0 7
2017-09-19 02:01:31 2017-09-19 02:01:00 2 1 1 8
2017-09-19 02:01:32 2017-09-19 02:01:00 2 1 1 9
2017-09-19 02:01:34 2017-09-19 02:01:00 2 1 0 9
2017-09-19 02:01:35 2017-09-19 02:01:00 2 1 0 9
2017-09-19 02:01:35 2017-09-19 02:01:00 2 1 0 9
2017-09-19 02:01:39 2017-09-19 02:01:00 2 1 1 10
2017-09-19 02:01:58 2017-09-19 02:01:00 2 1 2 12
2017-09-19 02:01:58 2017-09-19 02:01:00 2 1 0 12
2017-09-19 02:02:02 2017-09-19 02:02:00 2 2 3 3
2017-09-19 02:02:32 2017-09-19 02:02:00 2 2 0 3
2017-09-19 02:02:32 2017-09-19 02:02:00 2 2 1 4
2017-09-19 02:02:40 2017-09-19 02:02:00 2 2 0 4
2017-09-19 02:02:41 2017-09-19 02:02:00 2 2 0 4
2017-09-19 02:02:44 2017-09-19 02:02:00 2 2 1 5
2017-09-19 02:02:53 2017-09-19 02:02:00 2 2 1 6
2017-09-19 02:03:00 2017-09-19 02:03:00 2 3 1 1
2017-09-19 02:03:00 2017-09-19 02:03:00 2 3 1 2
2017-09-19 02:03:05 2017-09-19 02:03:00 2 3 1 3
2017-09-19 02:03:06 2017-09-19 02:03:00 2 3 0 3
The problem is that I need the rolling sum per minute, so the question is: how do I reset the rolling sum whenever the minute changes?
df.X.cumsum()
How do I add a reset to this?
Thanks! This is similar to the question here.
Recommended answer
Input:
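import pandas as pd
# True on the last row of each minute (the next row falls in the following minute)
s = df['dateTime'].dt.floor('min').diff().shift(-1).eq(pd.Timedelta('1 minute'))
s1 = df['X'].cumsum()  # running total that never resets
# Subtract the total reached at the end of the previous minute from every row
df['CumX'] = s.mul(s1).diff().where(lambda x: x < 0).ffill().add(s1, fill_value=0)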
Output:
dateTime 1min hour minute X EXPECTED Rolling_X CumX
0 2017-09-19 02:00:04 2017-09-19 02:00:00 2 0 93 93 93.0
1 2017-09-19 02:00:04 2017-09-19 02:00:00 2 0 1 94 94.0
2 2017-09-19 02:00:04 2017-09-19 02:00:00 2 0 1 95 95.0
3 2017-09-19 02:00:04 2017-09-19 02:00:00 2 0 0 95 95.0
4 2017-09-19 02:00:04 2017-09-19 02:00:00 2 0 0 95 95.0
5 2017-09-19 02:00:22 2017-09-19 02:00:00 2 0 1 96 96.0
6 2017-09-19 02:00:22 2017-09-19 02:00:00 2 0 1 97 97.0
7 2017-09-19 02:00:22 2017-09-19 02:00:00 2 0 2 99 99.0
8 2017-09-19 02:00:58 2017-09-19 02:00:00 2 0 1 100 100.0
9 2017-09-19 02:00:58 2017-09-19 02:00:00 2 0 0 100 100.0
10 2017-09-19 02:01:00 2017-09-19 02:01:00 2 1 7 7 7.0
11 2017-09-19 02:01:00 2017-09-19 02:01:00 2 1 0 7 7.0
12 2017-09-19 02:01:00 2017-09-19 02:01:00 2 1 0 7 7.0
13 2017-09-19 02:01:31 2017-09-19 02:01:00 2 1 0 7 7.0
14 2017-09-19 02:01:31 2017-09-19 02:01:00 2 1 1 8 8.0
15 2017-09-19 02:01:32 2017-09-19 02:01:00 2 1 1 9 9.0
16 2017-09-19 02:01:34 2017-09-19 02:01:00 2 1 0 9 9.0
17 2017-09-19 02:01:35 2017-09-19 02:01:00 2 1 0 9 9.0
18 2017-09-19 02:01:35 2017-09-19 02:01:00 2 1 0 9 9.0
19 2017-09-19 02:01:39 2017-09-19 02:01:00 2 1 1 10 10.0
20 2017-09-19 02:01:58 2017-09-19 02:01:00 2 1 2 12 12.0
21 2017-09-19 02:01:58 2017-09-19 02:01:00 2 1 0 12 12.0
22 2017-09-19 02:02:02 2017-09-19 02:02:00 2 2 3 3 3.0
23 2017-09-19 02:02:32 2017-09-19 02:02:00 2 2 0 3 3.0
24 2017-09-19 02:02:32 2017-09-19 02:02:00 2 2 1 4 4.0
25 2017-09-19 02:02:40 2017-09-19 02:02:00 2 2 0 4 4.0
26 2017-09-19 02:02:41 2017-09-19 02:02:00 2 2 0 4 4.0
27 2017-09-19 02:02:44 2017-09-19 02:02:00 2 2 1 5 5.0
28 2017-09-19 02:02:53 2017-09-19 02:02:00 2 2 1 6 6.0
29 2017-09-19 02:03:00 2017-09-19 02:03:00 2 3 1 1 1.0
30 2017-09-19 02:03:00 2017-09-19 02:03:00 2 3 1 2 2.0
31 2017-09-19 02:03:05 2017-09-19 02:03:00 2 3 1 3 3.0
32 2017-09-19 02:03:06 2017-09-19 02:03:00 2 3 0 3 3.0
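The answer's boundary test, `eq(pd.Timedelta('1 minute'))`, only fires when the next row falls in the immediately following minute, so a minute with no rows would not trigger a reset. The sketch below is a variant, not part of the original answer, that marks the first row of every minute instead and subtracts the running total reached just before it. The sample data and the names `minute`, `new_minute`, `running` and `baseline` are hypothetical and chosen for illustration only.

```python
import pandas as pd

# Hypothetical sample: three rows in 02:00, two in 02:01, no rows in 02:02,
# then two rows in 02:03.
df = pd.DataFrame({
    "dateTime": pd.to_datetime([
        "2017-09-19 02:00:04", "2017-09-19 02:00:22", "2017-09-19 02:00:58",
        "2017-09-19 02:01:00", "2017-09-19 02:01:31",
        "2017-09-19 02:03:00", "2017-09-19 02:03:05",
    ]),
    "X": [93, 1, 2, 7, 1, 1, 1],
})

# Minute bucket of each row.
minute = df["dateTime"].dt.floor("min")
# True on the first row of each minute (including the very first row).
new_minute = minute.ne(minute.shift())
# Running total that never resets.
running = df["X"].cumsum()
# Running total just before each row; keep it only where a minute starts,
# then carry that baseline forward through the rest of the minute.
baseline = (running - df["X"]).where(new_minute).ffill()
# Per-minute cumulative sum: total so far minus the total before this minute.
df["CumX"] = (running - baseline).astype(int)

print(df)
```

On this sample the `CumX` column restarts at 7 for 02:01 and at 1 for 02:03, even though 02:02 has no rows. The approach assumes the rows are already sorted by `dateTime`.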