groupby with overlapping time intervals

Problem description

I have a time series in a pandas DataFrame and I want to group it by its index, but I want overlapping groups, i.e. the groups are not disjoint. header_sec is the index column. Each group consists of a 2-second window. Input DataFrame:

    header_sec
1  17004 days 22:17:13
2  17004 days 22:17:13
3  17004 days 22:17:13
4  17004 days 22:17:13
5  17004 days 22:17:14
6  17004 days 22:17:14
7  17004 days 22:17:14
8  17004 days 22:17:14
9  17004 days 22:17:15
10 17004 days 22:17:15
11 17004 days 22:17:15
12 17004 days 22:17:15
13 17004 days 22:17:16
14 17004 days 22:17:16
15 17004 days 22:17:16
16 17004 days 22:17:16
17 17004 days 22:17:17
18 17004 days 22:17:17
19 17004 days 22:17:17
20 17004 days 22:17:17

My first group should have

1  17004 days 22:17:13
2  17004 days 22:17:13
3  17004 days 22:17:13
4  17004 days 22:17:13
5  17004 days 22:17:14
6  17004 days 22:17:14
7  17004 days 22:17:14
8  17004 days 22:17:14

The second group starts at the previous index value and takes half of the records from the previous second.

7  17004 days 22:17:14
8  17004 days 22:17:14
9  17004 days 22:17:15
10 17004 days 22:17:15
11 17004 days 22:17:15
12 17004 days 22:17:15
13 17004 days 22:17:16
14 17004 days 22:17:16

Third group .....

13 17004 days 22:17:16
14 17004 days 22:17:16
15 17004 days 22:17:16
16 17004 days 22:17:16
17 17004 days 22:17:17
18 17004 days 22:17:17
19 17004 days 22:17:17
20 17004 days 22:17:17

If I do groupby on index,

  dfgroup=df.groupby(df.index)

this gives one group per second. What would be the best way to merge these groups?
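To make the question reproducible, here is a minimal sketch that rebuilds the sample data and shows the one-group-per-second behaviour. The column name `row` is an assumption added for illustration (the question only shows the index), and the data is assumed to have exactly four rows per second as in the listing above.

```python
import pandas as pd

# Rebuild the sample: 20 rows, 4 per second, indexed by a Timedelta.
start = pd.Timedelta(days=17004, hours=22, minutes=17, seconds=13)
index = pd.TimedeltaIndex(
    [start + pd.Timedelta(seconds=i // 4) for i in range(20)],
    name="header_sec",
)
# "row" is a hypothetical payload column holding the row numbers 1..20.
df = pd.DataFrame({"row": range(1, 21)}, index=index)

# Grouping on the index yields one disjoint group per distinct second.
sizes = df.groupby(df.index).size()
print(sizes)
```

With this data the result has five groups of four rows each, none of which overlap.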

Solution

Here is a technique:

import pandas as pd  # the code below uses pd; df is the DataFrame from the question

grouped = df.groupby(df.index)

for name, group in grouped:
    # A list indexer makes .loc return a DataFrame even when only one row matches.
    try:
        prev_sec = df.loc[[name - pd.to_timedelta(1, unit='s')], :]
    except KeyError:
        prev_sec = pd.DataFrame(columns=group.columns)
    try:
        next_sec = df.loc[[name + pd.to_timedelta(1, unit='s')], :]
    except KeyError:
        next_sec = pd.DataFrame(columns=group.columns)
    Pn = 2  # replace this with int(len(prev_sec)/2) to get half the rows from the previous second
    Nn = 2  # replace this with int(len(next_sec)/2) to get half the rows from the next second
    group = pd.concat([prev_sec.iloc[-Pn:, :], group, next_sec.iloc[:Nn, :]])

    # Replace the line below with your own operations
    print(name, group)
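The same idea can be wrapped in a generator and run end to end. This is only a sketch under stated assumptions: `overlapping_groups` is a hypothetical helper name, the `row` column is invented for illustration, and the sample is assumed to have four rows per second. It takes half of the rows from each neighbouring second rather than a fixed count of 2, and it skips empty neighbours instead of concatenating empty frames.

```python
import pandas as pd

def overlapping_groups(df, step=pd.Timedelta(seconds=1)):
    """Yield (label, window): each group plus half of each neighbouring group."""
    groups = {name: g for name, g in df.groupby(df.index)}
    empty = df.iloc[0:0]  # empty frame with the same columns and dtypes
    for name, group in groups.items():
        prev_sec = groups.get(name - step, empty)
        next_sec = groups.get(name + step, empty)
        n_prev = len(prev_sec) // 2
        n_next = len(next_sec) // 2
        pieces = [
            prev_sec.iloc[len(prev_sec) - n_prev:],  # last half of previous second
            group,
            next_sec.iloc[:n_next],                  # first half of next second
        ]
        # Drop empty pieces so concat only sees real rows.
        yield name, pd.concat([p for p in pieces if len(p)])

# Sample data mirroring the question; "row" is a hypothetical column.
start = pd.Timedelta(days=17004, hours=22, minutes=17, seconds=13)
index = pd.TimedeltaIndex(
    [start + pd.Timedelta(seconds=i // 4) for i in range(20)],
    name="header_sec",
)
df = pd.DataFrame({"row": range(1, 21)}, index=index)

windows = dict(overlapping_groups(df))
print(windows[start + pd.Timedelta(seconds=2)]["row"].tolist())
# → [7, 8, 9, 10, 11, 12, 13, 14]
```

The window centred on 22:17:15 reproduces the second group from the question. The first and last windows are shorter because they have only one neighbour, so they do not match the 8-row first and third groups shown in the question; those would need a different windowing rule.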
