This post covers how to format timedelta values as strings in a pandas DataFrame.
Problem description
I have a DataFrame df whose first column has dtype timedelta64.
df.info():
<class 'pandas.core.frame.DataFrame'>
RangeIndex: 686 entries, 0 to 685
Data columns (total 6 columns):
0 686 non-null timedelta64[ns]
1 686 non-null object
2 686 non-null object
3 686 non-null object
4 686 non-null object
5 686 non-null object
For example, print(df[0][2]) gives me 0 days 05:01:11. However, I don't want the 0 days part; I only want 05:01:11 to be printed. Could someone show me how to do this? Thanks so much!
Recommended answer
One way to do this is to convert the column to strings and slice out the time part:
df['duration1'] = df['duration'].astype(str).str[-18:-10]
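The slice [-18:-10] picks the eight HH:MM:SS characters out of a string of the form 0 days 05:01:11.000000000, skipping the ten trailing characters of fractional seconds. It therefore depends on how the pandas version in use renders timedeltas as strings, so check the result on your own data.
This solution is also not general: for an input such as 3 days 05:01:11 it drops the 3 days part as well.
So it only gives correct results for timedeltas shorter than one day.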
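To see why the slice loses the day count, here is a minimal sketch on plain Python strings. The two sample strings are an assumption: they mimic a string form with a nine-digit fractional-seconds suffix, which is what the slice in the answer expects.

```python
# Assumed string forms of two timedeltas (nine fractional digits at the end)
samples = ["0 days 05:01:11.000000000", "3 days 05:01:11.000000000"]

# Same slice as in the answer: keep the 8 characters before the last 10
sliced = [s[-18:-10] for s in samples]
print(sliced)  # ['05:01:11', '05:01:11']
```

Both inputs produce the same text, so the 3 days in the second value is silently lost.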
A more general solution is to build a custom format that folds the days into the hours field:
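import numpy as np
import pandas as pd

# Build sample data: absolute differences between randomly chosen timestamps
N = 10
np.random.seed(11230)
rng = pd.date_range('2017-04-03 15:30:00', periods=N, freq='13.5H')
df = pd.DataFrame({'duration': np.abs(np.random.choice(rng, size=N) -
                                      np.random.choice(rng, size=N))})

# String-slicing approach: keeps only HH:MM:SS and drops the days
df['duration1'] = df['duration'].astype(str).str[-18:-10]

def f(x):
    # Convert the whole timedelta to seconds, then split into H:MM:SS
    ts = x.total_seconds()
    hours, remainder = divmod(ts, 3600)
    minutes, seconds = divmod(remainder, 60)
    return '{}:{:02d}:{:02d}'.format(int(hours), int(minutes), int(seconds))

# Custom format: days are counted as 24 hours each
df['duration2'] = df['duration'].apply(f)
print(df)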
         duration duration1  duration2
0 2 days 06:00:00  06:00:00   54:00:00
1 2 days 19:30:00  19:30:00   67:30:00
2 1 days 03:00:00  03:00:00   27:00:00
3 0 days 00:00:00  00:00:00    0:00:00
4 4 days 12:00:00  12:00:00  108:00:00
5 1 days 03:00:00  03:00:00   27:00:00
6 0 days 13:30:00  13:30:00   13:30:00
7 1 days 16:30:00  16:30:00   40:30:00
8 0 days 00:00:00  00:00:00    0:00:00
9 1 days 16:30:00  16:30:00   40:30:00
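The custom format prints hours without a leading zero (0:00:00), whereas the question asked for output like 05:01:11. As a possible variation, the sketch below computes the same fields column-wise and pads the hours to two digits. The helper name format_hms and the sample values are hypothetical, and the sketch assumes the column has no missing (NaT) or negative values; fractional seconds are truncated.

```python
import pandas as pd

def format_hms(durations: pd.Series) -> pd.Series:
    """Format a timedelta Series as HH:MM:SS, counting each day as 24 hours.

    Hypothetical helper; assumes no NaT and no negative durations.
    """
    # Whole seconds per row (the integer cast would fail on NaT)
    total = durations.dt.total_seconds().astype("int64")
    hours = total // 3600
    minutes = (total % 3600) // 60
    seconds = total % 60
    # zfill(2) pads to at least two digits and leaves longer values alone
    return (
        hours.astype(str).str.zfill(2)
        + ":" + minutes.astype(str).str.zfill(2)
        + ":" + seconds.astype(str).str.zfill(2)
    )

sample = pd.Series(pd.to_timedelta(["0 days 05:01:11", "3 days 05:01:11"]))
result = format_hms(sample)
print(result.tolist())  # ['05:01:11', '77:01:11']
```

Working on the whole column with integer arithmetic avoids calling a Python function per row, at the cost of needing explicit handling if missing values can occur.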