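How can I add an extra column that holds the cumulative time difference within each class (here, each id_A)? For example, the initial table is: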
id_A  course       weight  ts_A                 value
id1   cotton       3.5     2017-04-27 01:35:30  150.000000
id1   cotton       3.5     2017-04-27 01:36:00  416.666667
id1   cotton       3.5     2017-04-27 01:36:30  700.000000
id1   cotton       3.5     2017-04-27 01:37:00  950.000000
id2   cotton blue  5.0     2017-04-27 02:35:30  150.000000
id2   cotton blue  5.0     2017-04-27 02:36:00  450.000000
id2   cotton blue  5.0     2017-04-27 02:36:30  520.666667
id2   cotton blue  5.0     2017-04-27 02:37:00  610.000000
The expected result is:
id_A  course       weight  ts_A                 value       cum_delta_sec
id1   cotton       3.5     2017-04-27 01:35:30  150.000000  0
id1   cotton       3.5     2017-04-27 01:36:00  416.666667  30
id1   cotton       3.5     2017-04-27 01:36:30  700.000000  60
id1   cotton       3.5     2017-04-27 01:37:00  950.000000  90
id2   cotton blue  5.0     2017-04-27 02:35:30  150.000000  0
id2   cotton blue  5.0     2017-04-27 02:36:00  450.000000  30
id2   cotton blue  5.0     2017-04-27 02:36:30  520.666667  60
id2   cotton blue  5.0     2017-04-27 02:37:00  610.000000  90
Best answer
You can chain the diff method with cumsum, applied separately to each id_A group:
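import pandas as pd

# convert ts_A to datetime type
df['ts_A'] = pd.to_datetime(df['ts_A'])
# convert ts_A to seconds since the epoch (assumes datetime64[ns] values, so the integers are nanoseconds),
# group by id_A, then use transform to calculate the cumulative difference within each group
# astype('int64') is used because plain int can map to a 32-bit integer on some platforms
df['cum_delta_sec'] = df['ts_A'].astype('int64').div(10**9).groupby(df['id_A']).transform(lambda x: x.diff().fillna(0).cumsum())
df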
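
To check the idea end to end, here is a minimal self-contained sketch that rebuilds the sample table and produces the same column. It is a variant, not the answer's exact code: it works on timedeltas through `dt.total_seconds()` instead of integer nanoseconds, so it does not depend on the timestamp resolution. The DataFrame construction and the name `step_sec` are my own assumptions for illustration, and the rows are assumed to be already sorted by ts_A within each id_A.

```python
import pandas as pd

# Sample data copied from the question's table
df = pd.DataFrame({
    "id_A": ["id1"] * 4 + ["id2"] * 4,
    "course": ["cotton"] * 4 + ["cotton blue"] * 4,
    "weight": [3.5] * 4 + [5.0] * 4,
    "ts_A": [
        "2017-04-27 01:35:30", "2017-04-27 01:36:00",
        "2017-04-27 01:36:30", "2017-04-27 01:37:00",
        "2017-04-27 02:35:30", "2017-04-27 02:36:00",
        "2017-04-27 02:36:30", "2017-04-27 02:37:00",
    ],
    "value": [150.0, 416.666667, 700.0, 950.0,
              150.0, 450.0, 520.666667, 610.0],
})
df["ts_A"] = pd.to_datetime(df["ts_A"])

# Per-group difference between consecutive timestamps, in seconds;
# the first row of each group has no predecessor, so it becomes 0
step_sec = df.groupby("id_A")["ts_A"].diff().dt.total_seconds().fillna(0)

# Running total of those steps, restarted for every id_A
df["cum_delta_sec"] = step_sec.groupby(df["id_A"]).cumsum().astype(int)

print(df)
```

If the rows are not in time order, sorting first with `df.sort_values(["id_A", "ts_A"])` should give the intended result.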
Regarding "python - adding an extra column as a cumulative time difference", a similar question was found on Stack Overflow: https://stackoverflow.com/questions/45219131/