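Is there an existing library that can split a datetime column into columns that each hold a single component, such as year, month, day, hour, and minute?
I'm doing this as a preprocessing step for data I plan to try machine learning on (the Kaggle New York City taxi fare dataset).
This is what the datetime column in the dataset looks like:
I can already do this with the following: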
df_raw["pickup_year"] = df_raw['pickup_datetime'].dt.year
df_raw["pickup_month"] = df_raw['pickup_datetime'].dt.month
df_raw["pickup_day"] = df_raw['pickup_datetime'].dt.day
df_raw["pickup_hour"] = df_raw['pickup_datetime'].dt.hour
df_raw["pickup_minute"] = df_raw['pickup_datetime'].dt.minute
df_raw["pickup_second"] = df_raw['pickup_datetime'].dt.second
df_raw["pickup_dayofyear"] = df_raw['pickup_datetime'].dt.dayofyear
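# .dt.week and .dt.weekofyear were removed in pandas 2.0;
# .dt.isocalendar().week (pandas 1.1+) returns the same ISO week number.
df_raw["pickup_week"] = df_raw['pickup_datetime'].dt.isocalendar().week
df_raw["pickup_weekofyear"] = df_raw['pickup_datetime'].dt.isocalendar().week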
df_raw["pickup_dayofweek"] = df_raw['pickup_datetime'].dt.dayofweek
df_raw["pickup_weekday"] = df_raw['pickup_datetime'].dt.weekday
df_raw["pickup_quarter"] = df_raw['pickup_datetime'].dt.quarter
df_raw.head()
But surely this has already been done in a library somewhere?
Best answer
You can loop over a list of attribute names and use getattr to create the new columns:
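L = ['year', 'month', 'day', 'hour', 'minute', 'second', 'dayofyear',
     'week', 'weekofyear', 'dayofweek', 'weekday', 'quarter']
# Note: .dt.week and .dt.weekofyear were removed in pandas 2.0; on newer versions,
# drop them from the list and use df['Dates'].dt.isocalendar().week instead.
for i in L:
    df[i] = getattr(df['Dates'].dt, i)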
#jpp data sample
print(df)
                Dates  year  month  day  hour  minute  second  dayofyear  \
0 2017-12-11 01:00:00  2017     12   11     1       0       0        345
1 2017-12-12 01:00:01  2017     12   12     1       0       1        346
2 2019-05-12 15:15:00  2019      5   12    15      15       0        132
3 2019-06-22 03:25:14  2019      6   22     3      25      14        173
4 2020-05-11 04:40:02  2020      5   11     4      40       2        132
5 2020-11-30 01:00:00  2020     11   30     1       0       0        335

   week  weekofyear  dayofweek  weekday  quarter
0    50          50          0        0        4
1    50          50          1        1        4
2    19          19          6        6        2
3    25          25          5        5        2
4    20          20          0        0        2
5    49          49          0        0        4
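For anyone who wants to reuse the getattr loop, here is a minimal self-contained sketch of it wrapped in a helper. The function name add_datetime_parts, its prefix parameter, and the two-row sample frame are my own assumptions for illustration and are not part of pandas or of the answer above. It also assumes the column may arrive as strings, so it parses with pd.to_datetime first, and it takes the week number from .dt.isocalendar() so that it should run on pandas 1.1 and later.

```python
import pandas as pd


def add_datetime_parts(df, column, prefix=None):
    """Return a copy of df with one new column per datetime component.

    Hypothetical helper for illustration; not a pandas function.
    """
    out = df.copy()
    prefix = column if prefix is None else prefix
    # The .dt accessor needs a datetime64 column, so parse strings first.
    stamps = pd.to_datetime(out[column])
    parts = ['year', 'month', 'day', 'hour', 'minute', 'second',
             'dayofyear', 'dayofweek', 'quarter']
    for name in parts:
        # Same idea as the loop above: look up each attribute by name.
        out[f'{prefix}_{name}'] = getattr(stamps.dt, name)
    # ISO week number; replaces .dt.week / .dt.weekofyear on newer pandas.
    out[f'{prefix}_week'] = stamps.dt.isocalendar().week
    return out


# Two rows taken from the sample data shown above.
df = pd.DataFrame({'Dates': ['2017-12-11 01:00:00', '2019-05-12 15:15:00']})
result = add_datetime_parts(df, 'Dates', prefix='pickup')
print(result.filter(like='pickup_'))
```

Returning a copy rather than modifying df in place is just a design choice here, so the raw frame stays untouched while you experiment with features. Note that the week column comes back with pandas' nullable UInt32 dtype; cast it if your model needs plain integers.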
Regarding python - Python Pandas DataFrame datetime column splitting, a similar question was found on Stack Overflow: https://stackoverflow.com/questions/51799339/