I have been practicing with MultiIndex in pandas, but it does not work the way I expected. I think that is because my basic knowledge is still lacking.
from io import StringIO  # Python 3; on Python 2 use: from StringIO import StringIO
import pandas as pd
datacsv = StringIO("""\
date,id,a,b
20150209,42366,7644,6366
20150209,52219,2741,1796
20150209,52831,163,145
20150209,53209,1047,862
20150209,53773,31343,22501
20150209,58935,16621,14873
20150209,65464,19838,12177
20150209,65823,4903,2982
20150209,68497,16564,12207
20150209,79230,48714,37355
20150208,42366,7644,6366
20150208,52219,2741,1796
20150208,52831,163,145
20150208,53209,1047,862
20150208,53773,31343,22501
20150208,58935,16621,14873
20150208,65464,19838,12177
20150208,65823,4903,2982
20150208,68497,16564,12207
20150208,79230,48714,37355
""")
df = pd.read_csv(datacsv)
df = df.set_index(['date', 'id'])
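Currently, 'date' is not a datetime (read_csv parses it as an integer). How can I convert the 'date' type to datetime, so that it looks like 2015-02-09?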
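Best answer
You can use pd.to_datetime and specify the format to convert a Series (or column) to datetime.
For example, a Series of integers like the dates in your CSV file can be converted like this: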
>>> s = pd.Series([20150207, 20150208, 20150209])
>>> pd.to_datetime(s, format="%Y%m%d")
0   2015-02-07
1   2015-02-08
2   2015-02-09
dtype: datetime64[ns]
So, to change the date column before setting the index, you can write:
df['date'] = pd.to_datetime(df['date'], format="%Y%m%d")
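Putting the pieces together, here is a minimal end-to-end sketch. It uses a shortened copy of the CSV from the question (two ids per date) and assumes Python 3. The sort_index call and the name `day` are additions for illustration only and are not part of the original question.

```python
from io import StringIO

import pandas as pd

# Shortened copy of the question's CSV: two ids on each of the two dates.
datacsv = StringIO("""\
date,id,a,b
20150209,42366,7644,6366
20150209,52219,2741,1796
20150208,42366,7644,6366
20150208,52219,2741,1796
""")

df = pd.read_csv(datacsv)

# Convert the integer dates (e.g. 20150209) to datetimes BEFORE indexing.
df["date"] = pd.to_datetime(df["date"], format="%Y%m%d")

# Build the MultiIndex; sorting it is optional but keeps lookups predictable.
df = df.set_index(["date", "id"]).sort_index()

# The 'date' level now has a datetime64 dtype instead of an integer dtype.
print(df.index.get_level_values("date").dtype)

# Select every row for one day; the result is indexed by 'id' only.
day = df.loc[pd.Timestamp("2015-02-09")]
print(day)
```

Once the 'date' level holds real datetimes, you can select on it with pd.Timestamp values as shown, rather than with the raw integers from the file.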