This article covers how to compute a cumulative sum in pandas (Python) that resets every few minutes.
Problem description
I have this df:
dateTime 1min hour minute X EXPECTED Rolling_X
2017-09-19 02:00:04 2017-09-19 02:00:00 2 0 93 93
2017-09-19 02:00:04 2017-09-19 02:00:00 2 0 1 94
2017-09-19 02:00:04 2017-09-19 02:00:00 2 0 1 95
2017-09-19 02:00:22 2017-09-19 02:00:00 2 0 2 97
2017-09-19 02:01:31 2017-09-19 02:01:00 2 1 0 97
2017-09-19 02:01:31 2017-09-19 02:01:00 2 1 1 98
2017-09-19 02:01:32 2017-09-19 02:01:00 2 1 1 99
2017-09-19 02:01:34 2017-09-19 02:01:00 2 1 0 99
2017-09-19 02:01:35 2017-09-19 02:01:00 2 1 0 99
2017-09-19 02:01:35 2017-09-19 02:01:00 2 1 0 99
2017-09-19 02:01:39 2017-09-19 02:01:00 2 1 1 100
2017-09-19 02:01:58 2017-09-19 02:01:00 2 1 2 102
2017-09-19 02:01:58 2017-09-19 02:01:00 2 1 0 102
2017-09-19 02:02:02 2017-09-19 02:02:00 2 2 3 3
2017-09-19 02:02:32 2017-09-19 02:02:00 2 2 0 3
2017-09-19 02:02:32 2017-09-19 02:02:00 2 2 1 4
2017-09-19 02:02:40 2017-09-19 02:02:00 2 2 15 19
2017-09-19 02:02:41 2017-09-19 02:02:00 2 2 6 25
2017-09-19 02:02:44 2017-09-19 02:02:00 2 2 1 26
2017-09-19 02:02:53 2017-09-19 02:02:00 2 2 3 29
2017-09-19 02:03:00 2017-09-19 02:03:00 2 3 1 30
2017-09-19 02:03:00 2017-09-19 02:03:00 2 3 1 31
2017-09-19 02:03:05 2017-09-19 02:03:00 2 3 1 32
2017-09-19 02:04:07 2017-09-19 02:04:00 2 4 7 7
2017-09-19 02:04:58 2017-09-19 02:04:00 2 4 2 9
2017-09-19 02:05:22 2017-09-19 02:05:00 2 5 11 20
2017-09-19 02:05:36 2017-09-19 02:05:00 2 5 11 31
I want a running sum that resets every 2 minutes, as shown in the EXPECTED Rolling_X column of the df above. I am using the following code, but it does not work (although it does work when I use it per 1 minute):
s = df['dateTime'].dt.floor('T').diff().shift(-1).eq(pd.Timedelta('2 minutes'))
s1 = df['X'].cumsum()
df['2min_CumX'] = s.mul(s1).diff().where(lambda x: x < 0).ffill().add(s1, fill_value=0)
I read the documentation, and the structure looks correct, but it does not work as expected.
Thanks for your help!
Recommended answer
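Change the argument passed to floor from 'T' to '2T'. With floor('T'), consecutive rows in this data differ by at most 1 minute, so the comparison with pd.Timedelta('2 minutes') is never true and the sum never resets.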
s = df['dateTime'].dt.floor('2T').diff().shift(-1).eq(pd.Timedelta('2 minutes'))
s1 = df['X'].cumsum()
df['2min_CumX'] = s.mul(s1).diff().where(lambda x: x < 0).ffill().add(s1, fill_value=0)
Output:
dateTime 1min hour minute X EXPECTED Rolling_X 2min_CumX
0 2017-09-19 02:00:04 2017-09-19 02:00:00 2 0 93 93 93.0
1 2017-09-19 02:00:04 2017-09-19 02:00:00 2 0 1 94 94.0
2 2017-09-19 02:00:04 2017-09-19 02:00:00 2 0 1 95 95.0
3 2017-09-19 02:00:22 2017-09-19 02:00:00 2 0 2 97 97.0
4 2017-09-19 02:01:31 2017-09-19 02:01:00 2 1 0 97 97.0
5 2017-09-19 02:01:31 2017-09-19 02:01:00 2 1 1 98 98.0
6 2017-09-19 02:01:32 2017-09-19 02:01:00 2 1 1 99 99.0
7 2017-09-19 02:01:34 2017-09-19 02:01:00 2 1 0 99 99.0
8 2017-09-19 02:01:35 2017-09-19 02:01:00 2 1 0 99 99.0
9 2017-09-19 02:01:35 2017-09-19 02:01:00 2 1 0 99 99.0
10 2017-09-19 02:01:39 2017-09-19 02:01:00 2 1 1 100 100.0
11 2017-09-19 02:01:58 2017-09-19 02:01:00 2 1 2 102 102.0
12 2017-09-19 02:01:58 2017-09-19 02:01:00 2 1 0 102 102.0
13 2017-09-19 02:02:02 2017-09-19 02:02:00 2 2 3 3 3.0
14 2017-09-19 02:02:32 2017-09-19 02:02:00 2 2 0 3 3.0
15 2017-09-19 02:02:32 2017-09-19 02:02:00 2 2 1 4 4.0
16 2017-09-19 02:02:40 2017-09-19 02:02:00 2 2 15 19 19.0
17 2017-09-19 02:02:41 2017-09-19 02:02:00 2 2 6 25 25.0
18 2017-09-19 02:02:44 2017-09-19 02:02:00 2 2 1 26 26.0
19 2017-09-19 02:02:53 2017-09-19 02:02:00 2 2 3 29 29.0
20 2017-09-19 02:03:00 2017-09-19 02:03:00 2 3 1 30 30.0
21 2017-09-19 02:03:00 2017-09-19 02:03:00 2 3 1 31 31.0
22 2017-09-19 02:03:05 2017-09-19 02:03:00 2 3 1 32 32.0
23 2017-09-19 02:04:07 2017-09-19 02:04:00 2 4 7 7 7.0
24 2017-09-19 02:04:58 2017-09-19 02:04:00 2 4 2 9 9.0
25 2017-09-19 02:05:22 2017-09-19 02:05:00 2 5 11 20 20.0
26 2017-09-19 02:05:36 2017-09-19 02:05:00 2 5 11 31 31.0
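As an alternative to the answer above (not the answerer's method), the same reset can be sketched by grouping on the floored timestamp and taking a cumulative sum within each group. The small frame below reuses a subset of rows from the table; the alias "2min" is used in place of "2T" on the assumption that a recent pandas release is installed, where the "T" alias triggers a deprecation warning.

```python
import pandas as pd

# Illustrative frame: a subset of the rows shown in the table above
df = pd.DataFrame({
    "dateTime": pd.to_datetime([
        "2017-09-19 02:00:04",
        "2017-09-19 02:01:31",
        "2017-09-19 02:01:58",
        "2017-09-19 02:02:02",
        "2017-09-19 02:03:05",
        "2017-09-19 02:04:07",
        "2017-09-19 02:05:36",
    ]),
    "X": [93, 1, 2, 3, 1, 7, 11],
})

# Label each row with the start of its 2-minute bin (02:00, 02:02, 02:04, ...)
bins = df["dateTime"].dt.floor("2min")

# The cumulative sum restarts at the first row of every bin
df["2min_CumX"] = df.groupby(bins)["X"].cumsum()
print(df)
```

Because the grouping key is the bin itself, this version does not depend on consecutive bins being exactly 2 minutes apart, so a 2-minute window with no rows would not prevent the reset.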