I have a pandas Series df['realize']:
time realize
2016-01-18 08:25:00 -46.369083
2016-01-19 14:30:00 -819.010738
2016-01-20 11:10:00 -424.955847
2016-01-21 07:15:00 27.523859
2016-01-21 16:10:00 898.522762
2016-01-25 00:00:00 761.063545
where time is the index, set with:
df.index = df['time']
df.index = pd.to_datetime(df.index)
and df['realize'] is a Series:
In: type(df['realize'])
Out: pandas.core.series.Series
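I want to count consecutive values. The rule is simple: the count keeps increasing while the sign stays the same (df['realize'] > 0 or df['realize'] < 0) and restarts at 1 when the sign changes. Expected output: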
time realize Consecutive
2016-01-18 08:25:00 -46.369083 1
2016-01-19 14:30:00 -819.010738 2
2016-01-20 11:10:00 -424.955847 3
2016-01-21 07:15:00 27.523859 1
2016-01-21 16:10:00 898.522762 2
2016-01-25 00:00:00 761.063545 3
I have read threads about loops but did not find what I need. Thanks in advance for your help.
Best answer
You can do the following:
g = df.realize.gt(0).astype(int).diff().fillna(0).abs().cumsum()
df['Consecutive'] = df.groupby(g).realize.cumcount().add(1)
time realize Consecutive
0 2016-01-18 08:25:00 -46.369083 1
1 2016-01-19 14:30:00 -819.010738 2
2 2016-01-20 11:10:00 -424.955847 3
3 2016-01-21 07:15:00 27.523859 1
4 2016-01-21 16:10:00 898.522762 2
5 2016-01-25 00:00:00 761.063545 3
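For anyone who wants to run this end to end, here is a minimal self-contained sketch. It rebuilds the DataFrame from the sample rows in the question (assuming time is an ordinary column, as in the output above) and then applies the same two lines:

```python
import pandas as pd

# Sample rows copied from the question; "time" is kept as a regular column here.
df = pd.DataFrame({
    "time": pd.to_datetime([
        "2016-01-18 08:25:00",
        "2016-01-19 14:30:00",
        "2016-01-20 11:10:00",
        "2016-01-21 07:15:00",
        "2016-01-21 16:10:00",
        "2016-01-25 00:00:00",
    ]),
    "realize": [-46.369083, -819.010738, -424.955847,
                27.523859, 898.522762, 761.063545],
})

# Flag positive values, mark rows where the flag differs from the previous row,
# then take the cumulative sum so every run of equal signs shares one label.
g = df["realize"].gt(0).astype(int).diff().fillna(0).abs().cumsum()

# Number the rows inside each run, starting at 1.
df["Consecutive"] = df.groupby(g)["realize"].cumcount().add(1)

print(df)
```

Note that gt(0) is False for an exact 0, so a zero would be counted in the same run as negative values.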
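The grouper is built from the first difference (DataFrame.diff) of the boolean Series that indicates whether realize is greater than 0. A nonzero difference marks a row where the sign changes, and the cumulative sum of those marks gives every run its own label:
diff = df.realize.gt(0).astype(int).diff().fillna(0).abs()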
df.assign(diff = diff, grouper = g)
time realize Consecutive diff grouper
0 2016-01-18 08:25:00 -46.369083 1 0.0 0.0
1 2016-01-19 14:30:00 -819.010738 2 0.0 0.0
2 2016-01-20 11:10:00 -424.955847 3 0.0 0.0
3 2016-01-21 07:15:00 27.523859 1 1.0 1.0
4 2016-01-21 16:10:00 898.522762 2 0.0 1.0
5 2016-01-25 00:00:00 761.063545 3 0.0 1.0
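As a possible alternative (not part of the answer above), the same run labels can be built by comparing the flag with its shifted copy, which avoids the int cast, fillna and abs steps. The function name consecutive_counts is hypothetical and used only for illustration:

```python
import pandas as pd


def consecutive_counts(s: pd.Series) -> pd.Series:
    """Hypothetical helper: 1-based position of each value within its run of
    'greater than 0' / 'not greater than 0' values."""
    flag = s.gt(0)
    # True wherever the flag differs from the previous row. The first row is
    # compared with a missing value, so it always starts a new run.
    run_id = flag.ne(flag.shift()).cumsum()
    return s.groupby(run_id).cumcount().add(1)


realize = pd.Series(
    [-46.369083, -819.010738, -424.955847, 27.523859, 898.522762, 761.063545]
)
print(consecutive_counts(realize).tolist())
```

The run labels start at 1 instead of 0 here, which makes no difference to the counts because the labels are only used for grouping.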