本文介绍了通过 Python 根据 if 条件添加新行的处理方法,对大家解决问题具有一定的参考价值,需要的朋友们下面随着小编来一起学习吧!

问题描述

如果door列的两个连续单元格相同并且time列中的两个连续单元格之间的差值大于5,我需要添加一个新行分钟.df:

I need to add a new row if two consecutive cells of the column door are the same and the difference between two consecutive cells in the column time is more than 5 minutes.df:

Time        door             name
09:10:00    RDC_OUT-1        alex
09:10:00    RDC_OUT-1        alex
11:23:00    RDC_IN-1         alex
12:13:00    RDC_IN-1         alex
12:39:00    RDC_OUT-1        alex
15:23:00    RDC_IN-1         alex

代码:

import pandas as pd
import numpy as np
file_name='test.xlsx'
from datetime import timedelta
import datetime

df = pd.read_excel(file_name, header=0, index= False)
df= df.sort_values(by='Time')
df.reset_index(inplace = True)

print(df)

idx=[]
for i in range (0,len(df)):
    if i == 0:
        print ('Door Name '+str(i)+' ok')
    elif df['Door Name'][i] != df['Door Name'][i-1]:
        print('index '+str(i)+' ok')

    elif ((df['Door Name'][i] == df['Door Name'][i-1]) & ((df['Time'].iloc[i] - df['Time'].iloc[i-1]) > datetime.timedelta(minutes=5))):
        print('index '+str(i)+' ok')
        df.iloc[i] = [i,'RDC_OUT-1', str('12:00:00'), 'ARYA']



    elif ((df['Door Name'][i] == df['Door Name'][i-1]) & ((df['Time'].iloc[i] - df['Time'].iloc[i-1]) < datetime.timedelta(minutes=5))):
        print('index '+str(i)+' nok')
        idx.append(i)
        print('idx\n',idx)

df.drop(df.index[[idx]],inplace=True)
print('\n',df)

期望的输出:

Time        door              name
Time        door             name
09:10:00    RDC_OUT-1        alex
11:23:00    RDC_IN-1         alex
12:00:00    RDC_OUT-1        ARYA
12:13:00    RDC_IN-1         alex
12:39:00    RDC_OUT-1        alex
15:23:00    RDC_IN-1         alex

输出

0      4  09:10:00    RDC_OUT-1   alex
2      3  11:23:00    RDC_IN-1    alex
3      2  12:13:00    RDC_IN-1    alex
4      3  12:00:00    RDC_OUT-1   ARYA
5      0  15:23:00    RDC_IN-1    alex

推荐答案

因此,首先,我强烈建议您始终提供一个可用于复制粘贴的示例!

So first, I highly recommend you to always deliver a working example, for copy-paste!

import pandas as pd
import numpy as np
import datetime as dt

df= pd.DataFrame({'Time':['17:01:10', '13:23:00', '11:23:00', '10:01:10','09:01:10','09:01:10'],
 'door':['RDC_OUT-1', 'RDC_IN-1','RDC_IN-1','RDC_OUT-1','RDC_IN-1','RDC_IN-1'],
 'name':['alex','alex','alex','alex','alex','alex']})

然后,转换您的时间戳和特征,以便您可以对其进行数学运算:

then, convert your time stamp and features, so you can do math on it:

# replace door with bin value
df['door']= df['door'].map({'RDC_IN-1': 0, 'RDC_OUT-1': 1})
# convert time stamp
df['Time'] = pd.to_datetime(df['Time'], format="%H:%M:%S")

现在你可以释放pandas数据框的力量了;)

Now you are able to unleash the power of pandas data frame ;)

# sort by time stamp
df= df.sort_values(by='Time')

# calculate difference to next row per column
df_diff = df[['Time', 'door']].diff(periods=-1)

# select and copy relevant rows
df_add = df[(df_diff.Time < dt.timedelta(minutes=-5))& (df_diff.door ==0)].copy()

# change the time stamp of copied rows
df_add.loc[df_add.door == 0, 'Time'] =  pd.to_datetime('12:00:00', format="%H:%M:%S")
df_add.loc[df_add.door == 1, 'Time'] =  pd.to_datetime('14:00:00', format="%H:%M:%S")


# switch the label of copied rows
df_add['door']= -(df['door']-1)

# change name to mark the new
df_add['name']= 'new_alex'

# append existing data frame with new rows and sort by time stamp
df = df.append(df_add ).sort_values(by='Time')

# remap the door featuere
df['door']= df['door'].map({0:'RDC_IN-1', 1:'RDC_OUT-1'})

这应该给你输出:

                 Time       door      name
4 1900-01-01 09:01:10   RDC_IN-1      alex
5 1900-01-01 09:01:10   RDC_IN-1      alex
3 1900-01-01 10:01:10  RDC_OUT-1      alex
2 1900-01-01 11:23:00   RDC_IN-1      alex
2 1900-01-01 12:00:00  RDC_OUT-1  new_alex
1 1900-01-01 13:23:00   RDC_IN-1      alex
0 1900-01-01 17:01:10  RDC_OUT-1      alex


这篇关于通过 Python 根据 if 条件添加新行的文章就介绍到这了,希望我们推荐的答案对大家有所帮助,也希望大家多多支持!

08-19 16:36