I have the following function, which checks whether a given time falls between two values:
from datetime import datetime, time

def is_time_between(begin_time, end_time, check_time=None):
    # If check time is not given, default to current UTC time
    check_time = check_time or datetime.utcnow().time()
    if begin_time < end_time:
        return check_time >= begin_time and check_time <= end_time
    else:  # crosses midnight
        return check_time >= begin_time or check_time <= end_time
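The function works fine. I want to use it to compare the time values in a DataFrame column and fill another column based on the result, like this: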
if is_time_between(time(5,0), time(12,59), df.time):
    df['day_interval'] = 1
elif is_time_between(time(13,0), time(17,59), df['time']):
    df['day_interval'] = 2
elif is_time_between(time(18,0), time(23,59), df['time']):
    df['day_interval'] = 3
else:
    df['day_interval'] = 4
Running this code raises the following error:
ValueError: The truth value of a Series is ambiguous. Use a.empty, a.bool(), a.item(), a.any() or a.all().
Best answer
Use numpy.select together with Series.apply, which returns a boolean mask for each condition based on the column values:
import numpy as np
import pandas as pd

df = pd.DataFrame({'date': ['2019-10-1 01:00:10',
                            '2019-10-2 14:00:10',
                            '2019-10-31 19:00:10',
                            '2019-10-31 06:00:10']})
df['time'] = pd.to_datetime(df['date']).dt.time
print(df)
                  date      time
0   2019-10-1 01:00:10  01:00:10
1   2019-10-2 14:00:10  14:00:10
2  2019-10-31 19:00:10  19:00:10
3  2019-10-31 06:00:10  06:00:10
m1 = df['time'].apply(lambda x: is_time_between(time(5,0), time(12,59), x))
m2 = df['time'].apply(lambda x: is_time_between(time(13,0), time(17,59), x))
m3 = df['time'].apply(lambda x: is_time_between(time(18,0), time(23,59), x))
df['day_interval'] = np.select([m1, m2, m3], [1,2,3], default=4)
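The original `if` chain fails because `if`, `and` and `or` each need a single True/False value, whereas comparing a Series produces one boolean per row. As a minimal sketch, and assuming the intervals always start and end on whole hours, the masks can also be built without `apply` by comparing the hour of each timestamp. The frame name `demo` and the zero-padded dates are illustrative, not from the question. Note one difference: a time such as 12:59:30 lands in interval 1 here, while `is_time_between` with `time(12,59)` as the end treats it as outside that interval.

```python
import numpy as np
import pandas as pd

# Illustrative data, mirroring the sample frame in the answer
demo = pd.DataFrame({"date": ["2019-10-01 01:00:10",
                              "2019-10-02 14:00:10",
                              "2019-10-31 19:00:10",
                              "2019-10-31 06:00:10"]})

# Hour of day as an integer Series (0-23)
hours = pd.to_datetime(demo["date"]).dt.hour

# One boolean mask per interval; between() is inclusive on both ends
m1 = hours.between(5, 12)
m2 = hours.between(13, 17)
m3 = hours.between(18, 23)

# np.select takes the choice of the first mask that is True, else the default
demo["day_interval"] = np.select([m1, m2, m3], [1, 2, 3], default=4)
print(demo)
```

Because the masks are computed on the whole column at once, this avoids calling a Python function once per row.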
Another solution uses cut, with the times converted to timedeltas by to_timedelta:
bins = pd.to_timedelta(['00:00:00','05:00:00','13:00:00','18:00:00','23:59:59'])
df['day_interval1'] = pd.cut(pd.to_timedelta(df['time'].astype(str)), bins, labels=[4,1,2,3])
print (df)
                  date      time  day_interval day_interval1
0   2019-10-1 01:00:10  01:00:10             4             4
1   2019-10-2 14:00:10  14:00:10             2             2
2  2019-10-31 19:00:10  19:00:10             3             3
3  2019-10-31 06:00:10  06:00:10             1             1
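One thing worth checking with the cut approach is how values exactly on a bin edge are handled. By default, pd.cut uses right-closed intervals and excludes the lowest edge, so 00:00:00 gets no label and 05:00:00 falls into label 4 rather than 1. The sketch below compares that default with left-closed bins (`right=False`) running up to 24 hours. The names `times`, `default_bins`, `left_bins`, `right_closed` and `left_closed` are illustrative, and the left-closed variant is a suggested adjustment rather than part of the original answer.

```python
import pandas as pd

# Edge-case times, expressed as timedeltas since midnight
times = pd.Series(pd.to_timedelta(["00:00:00", "05:00:00", "12:59:30", "23:59:59"]))

# Bins as in the answer: intervals are (left, right] by default
default_bins = pd.to_timedelta(["00:00:00", "05:00:00", "13:00:00", "18:00:00", "23:59:59"])
right_closed = pd.cut(times, default_bins, labels=[4, 1, 2, 3])

# Left-closed bins [left, right), ending at 24 hours so 23:59:59 is covered
left_bins = pd.to_timedelta([0, 5, 13, 18, 24], unit="h")
left_closed = pd.cut(times, left_bins, labels=[4, 1, 2, 3], right=False)

print(pd.DataFrame({"time": times,
                    "right_closed": right_closed,
                    "left_closed": left_closed}))
```

With left-closed bins, 05:00:00 is labelled 1 and midnight is labelled 4, which is closer to the boundaries used in the `is_time_between` calls.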
This question, "python - error when filling a DataFrame based on a condition", comes from Stack Overflow: https://stackoverflow.com/questions/58413695/