Here is the sample dataset I am working with:

            maint id
datetime
2015-01-01    1.0  a
2015-01-02    NaN  a
2015-01-03    NaN  a
2015-01-04    1.0  a
2015-01-05    NaN  a
2015-01-06    NaN  a
2015-01-07    NaN  a
2015-01-01    NaN  b
2015-01-02    NaN  b
2015-01-03    1.0  b
2015-01-04    1.0  b
2015-01-05    NaN  b
2015-01-06    NaN  b
2015-01-07    NaN  b
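
For anyone who wants to reproduce the frame above, here is one possible way to build it. This is only a sketch: the variable name `dates` and the construction itself are assumptions, since the question does not show how the data was loaded.

```python
import numpy as np
import pandas as pd

# Seven consecutive days, repeated once for id "a" and once for id "b"
dates = pd.date_range("2015-01-01", periods=7, freq="D")

df = pd.DataFrame(
    {
        "maint": [1, np.nan, np.nan, 1, np.nan, np.nan, np.nan,
                  np.nan, np.nan, 1, 1, np.nan, np.nan, np.nan],
        "id": ["a"] * 7 + ["b"] * 7,
    },
    # The index holds duplicate dates because both ids share the same days
    index=dates.append(dates).rename("datetime"),
)
print(df)
```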


What I would like to get is the number of days since df['maint'] was last 1.

            maint id  days
datetime
2015-01-01    1.0  a     0
2015-01-02    NaN  a     1
2015-01-03    NaN  a     2
2015-01-04    1.0  a     0
2015-01-05    NaN  a     1
2015-01-06    NaN  a     2
2015-01-07    NaN  a     3
2015-01-01    NaN  b     0
2015-01-02    NaN  b     0
2015-01-03    1.0  b     0
2015-01-04    1.0  b     0
2015-01-05    NaN  b     1
2015-01-06    NaN  b     2
2015-01-07    NaN  b     3


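I have thousands of different ids, each with several years of maintenance records, so I would like to find an efficient way to calculate the day difference.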

Best answer

Use:

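import pandas as pd

# Date of each maintenance row (NaT elsewhere), then days since the last one per id
df['days'] = df.index.where(df['maint'].eq(1))
df['days'] = (df.index - df.groupby('id')['days'].ffill()).fillna(pd.Timedelta(0)).dt.days
print(df)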
            maint id  days
datetime
2015-01-01    1.0  a     0
2015-01-02    NaN  a     1
2015-01-03    NaN  a     2
2015-01-04    1.0  a     0
2015-01-05    NaN  a     1
2015-01-06    NaN  a     2
2015-01-07    NaN  a     3
2015-01-01    NaN  b     0
2015-01-02    NaN  b     0
2015-01-03    1.0  b     0
2015-01-04    1.0  b     0
2015-01-05    NaN  b     1
2015-01-06    NaN  b     2
2015-01-07    NaN  b     3


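Explanation:

First, create a new column days from the values of df.index where maint is 1; every other row gets NaT.
Then subtract the series produced by GroupBy.ffill from the index, replace the resulting NaN values with a 0 timedelta, and finally convert to days with Series.dt.days.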
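
The two steps above can also be written out with an intermediate column, which makes each stage easy to inspect. This is a sketch under assumptions: the sample frame is rebuilt by hand, and the column name `last_maint` is hypothetical (the answer itself reuses `days` for both steps).

```python
import numpy as np
import pandas as pd

# Rebuild the sample frame from the question (this construction is an assumption)
dates = pd.date_range("2015-01-01", periods=7, freq="D")
df = pd.DataFrame(
    {
        "maint": [1, np.nan, np.nan, 1, np.nan, np.nan, np.nan,
                  np.nan, np.nan, 1, 1, np.nan, np.nan, np.nan],
        "id": ["a"] * 7 + ["b"] * 7,
    },
    index=dates.append(dates).rename("datetime"),
)

# Step 1: keep the index date on maintenance rows, NaT everywhere else
df["last_maint"] = df.index.where(df["maint"].eq(1).to_numpy())

# Step 2: carry the last maintenance date forward within each id only
df["last_maint"] = df.groupby("id")["last_maint"].ffill()

# Step 3: elapsed time since that date; rows before any maintenance stay NaT
delta = df.index - df["last_maint"]

# Step 4: treat "no earlier maintenance" as 0 and convert to whole days
df["days"] = delta.fillna(pd.Timedelta(0)).dt.days

print(df[["maint", "id", "days"]])
```

Because every step is a vectorized operation and the only per-group work is the forward fill, this avoids a Python-level loop over the ids.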

Regarding "python - What is an efficient way to calculate the date difference since the last maintenance?", a similar question can be found on Stack Overflow: https://stackoverflow.com/questions/55085235/