I have the following dictionary:
dic = {'T1': ["2013-11-12 17:35:00", "2013-11-12 17:36:00", "2013-11-12 17:37:00",
              "2013-11-12 17:38:00", "2013-11-12 17:40:00", "2013-11-12 17:41:00",
              "2013-11-12 17:42:00"],
       'T2': ["2013-11-12 12:15:00", "2013-11-12 12:16:00", "2013-11-13 16:32:00",
              "2013-11-13 16:33:00", "2013-11-13 16:34:00"]}
From it I want to generate the following MultiIndexed DataFrame:
                 T1                                        T2
              Start                 Stop                Start                 Stop
2013-11-12 17:35:00  2013-11-12 17:38:00  2013-11-12 12:15:00  2013-11-12 12:16:00
2013-11-12 17:40:00  2013-11-12 17:42:00  2013-11-13 16:32:00  2013-11-13 16:34:00
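The DataFrame describes when certain events of sensor T1 or T2 start and stop. If the time difference between two consecutive timestamps is no more than 1 minute, I treat it as the same event continuing; a difference greater than 1 minute means a new event has started.
Thanks for your help :)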
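Best answer
You can compute the difference between consecutive timestamps and build a mask that is True wherever the difference is not exactly 1 minute: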
df['mask'] = (df[key].diff() / np.timedelta64(1, 'm')) != 1
Then take the cumulative sum of the mask to identify which rows belong to which group:
df['group'] = df['mask'].cumsum()
which yields something like:
                   T2   mask  group
0 2013-11-12 12:15:00   True      1
1 2013-11-12 12:16:00  False      1
2 2013-11-13 16:32:00   True      2
3 2013-11-13 16:33:00  False      2
4 2013-11-13 16:34:00  False      2
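                   T1   mask  group
0 2013-11-12 17:35:00   True      1
1 2013-11-12 17:36:00  False      1
2 2013-11-12 17:37:00  False      1
3 2013-11-12 17:38:00  False      1
4 2013-11-12 17:40:00   True      2
5 2013-11-12 17:41:00  False      2
6 2013-11-12 17:42:00  False      2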
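The mask and cumulative-sum steps can be tried in isolation. The following is a minimal sketch using the T2 timestamps from the question; the variable names (`times`, `gaps`, `starts`, `groups`) are illustrative, and it compares against `pd.Timedelta(minutes=1)` directly rather than dividing by `np.timedelta64(1, 'm')`, which should give the same mask for this data.

```python
import pandas as pd

# Timestamps for sensor T2, copied from the question.
times = pd.Series(pd.to_datetime([
    "2013-11-12 12:15:00", "2013-11-12 12:16:00", "2013-11-13 16:32:00",
    "2013-11-13 16:33:00", "2013-11-13 16:34:00",
]))

# Gap to the previous reading; the first row has no predecessor, so it is NaT.
gaps = times.diff()

# True wherever the gap is not exactly one minute. A NaT gap also compares
# as "not equal", so the first row always opens a group.
starts = gaps != pd.Timedelta(minutes=1)

# Each True increments the running total, giving consecutive group labels.
groups = starts.cumsum()

print(pd.DataFrame({"T2": times, "gap": gaps, "mask": starts, "group": groups}))
```

Because the first row is always True, the group labels start at 1 rather than 0.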
Now group by the group column and find the first and last timestamp of each group:
result[key] = df.groupby(['group'])[key].agg(['first', 'last'])
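To see what the aggregation returns on its own, here is a small sketch that starts from a frame whose group labels are already filled in (values taken from the T2 example; the name `spans` is illustrative).

```python
import pandas as pd

# A small frame that already carries group labels (values from the T2 example).
df = pd.DataFrame({
    "T2": pd.to_datetime([
        "2013-11-12 12:15:00", "2013-11-12 12:16:00", "2013-11-13 16:32:00",
        "2013-11-13 16:33:00", "2013-11-13 16:34:00",
    ]),
    "group": [1, 1, 2, 2, 2],
})

# 'first' and 'last' take the first and last row of each group in row order.
# They match the earliest and latest time only when the rows are sorted.
spans = df.groupby("group")["T2"].agg(["first", "last"])
print(spans)
```

If the timestamps might arrive unsorted, `agg(['min', 'max'])` would be a safer choice than `first` and `last`.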
The complete script:
import numpy as np
import pandas as pd
pd.options.display.width = 1000

dic = {'T1': ["2013-11-12 17:35:00", "2013-11-12 17:36:00", "2013-11-12 17:37:00",
              "2013-11-12 17:38:00", "2013-11-12 17:40:00", "2013-11-12 17:41:00",
              "2013-11-12 17:42:00"],
       'T2': ["2013-11-12 12:15:00", "2013-11-12 12:16:00", "2013-11-13 16:32:00",
              "2013-11-13 16:33:00", "2013-11-13 16:34:00"]}

result = dict()
for key, val in dic.items():
    df = pd.DataFrame({key: pd.to_datetime(val)})
    # True where the gap to the previous timestamp is not exactly 1 minute
    df['mask'] = (df[key].diff() / np.timedelta64(1, 'm')) != 1
    # running count of True values labels each event
    df['group'] = df['mask'].cumsum()
    # first and last timestamp of each event
    result[key] = df.groupby(['group'])[key].agg(['first', 'last'])
    result[key] = result[key].rename(columns={'first': 'Start', 'last': 'Stop'})

# the dict keys become the top level of the column MultiIndex
result = pd.concat(result, axis=1)
print(result)
yields
                        T1                                        T2
                     Start                 Stop                Start                 Stop
group
1      2013-11-12 17:35:00  2013-11-12 17:38:00  2013-11-12 12:15:00  2013-11-12 12:16:00
2      2013-11-12 17:40:00  2013-11-12 17:42:00  2013-11-13 16:32:00  2013-11-13 16:34:00
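The mask above tests for a gap of exactly 1 minute, which fits the evenly spaced sample data. The question states the rule as a threshold (a gap greater than 1 minute starts a new event), so readings at irregular intervals may need a comparison instead. The sketch below is one possible variant, not part of the original answer: `event_spans` and its `max_gap` parameter are hypothetical names, the sample readings are made up for illustration, and the function assumes a non-empty list of timestamps.

```python
import pandas as pd

def event_spans(times, max_gap=pd.Timedelta(minutes=1)):
    """Hypothetical helper: collapse timestamps into Start/Stop spans.

    A new event starts whenever the gap to the previous timestamp
    exceeds max_gap. Assumes at least one timestamp.
    """
    s = pd.Series(pd.to_datetime(times)).sort_values(ignore_index=True)
    new_event = s.diff() > max_gap   # the NaT gap in row 0 compares as False
    new_event.iloc[0] = True         # so open the first event explicitly
    group = new_event.cumsum().rename("group")
    spans = s.groupby(group).agg(["first", "last"])
    return spans.rename(columns={"first": "Start", "last": "Stop"})

# Made-up readings with irregular spacing, for illustration only.
readings = {
    "T1": ["2013-11-12 17:35:00", "2013-11-12 17:35:40",
           "2013-11-12 17:36:30", "2013-11-12 17:40:00"],
    "T2": ["2013-11-12 12:15:00", "2013-11-12 12:15:45"],
}

# Dict keys become the top level of the column MultiIndex.
result = pd.concat({k: event_spans(v) for k, v in readings.items()}, axis=1)
print(result)
```

When the sensors have different numbers of events, as in this made-up data, `pd.concat` aligns on the group index and pads the shorter sensor with `NaT`.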
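Regarding "python - Generating a multi-indexed DataFrame from a dictionary with values of different lengths", a similar question can be found on Stack Overflow: https://stackoverflow.com/questions/32610129/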