Pandas DataFrame datetime slicing with Index vs MultiIndex

Problem description

With a single-indexed DataFrame I can do the following:

from datetime import datetime as dt
from pandas import DataFrame, Index, MultiIndex

df2 = DataFrame(data={'data': [1, 2, 3]},
                index=Index([dt(2016, 1, 1),
                             dt(2016, 1, 2),
                             dt(2016, 2, 1)]))

>>> df2['2016-01' : '2016-01']
                data
    2016-01-01     1
    2016-01-02     2

>>> df2['2016-01-01' : '2016-01-01']
                data
    2016-01-01     1

Datetime slicing works when you give it a complete day (i.e. 2016-01-01), and it also works when you give it a partial date, such as just the year and month (2016-01). All of this works great, but once you introduce a MultiIndex, it only works for complete dates; partial date slicing no longer seems to work:

df = DataFrame(data={'data': [1, 2, 3]},
               index=MultiIndex.from_tuples([(dt(2016, 1, 1), 2),
                                             (dt(2016, 1, 1), 3),
                                             (dt(2016, 1, 2), 2)],
                                             names=['date', 'val']))


 >>> df['2016-01-01' : '2016-01-02']
                            data
     date       val
     2016-01-01 2           1
                3           2
     2016-01-02 2           3

OK, that's fine, but with a partial date:

>>> df['2016-01' : '2016-01']
 File "pandas/index.pyx", line 134, in pandas.index.IndexEngine.get_loc      (pandas/index.c:3824)
 File "pandas/index.pyx", line 154, in pandas.index.IndexEngine.get_loc (pandas/index.c:3704)
 File "pandas/hashtable.pyx", line 686, in pandas.hashtable.PyObjectHashTable.get_item (pandas/hashtable.c:12280)
 File "pandas/hashtable.pyx", line 694, in pandas.hashtable.PyObjectHashTable.get_item (pandas/hashtable.c:12231)
  KeyError: '2016-01'

(I shortened the traceback.)

Any idea if this is possible? Is this a bug? Is there any way to do what I want without having to resort to something like:

df.loc[(df.index.get_level_values('date') >= start_date) &
       (df.index.get_level_values('date') <= end_date)]

Any tips, comments, suggestions, etc. are MOST appreciated! I've tried a lot of other things to no avail!

Recommended answer

A cross-section should work:

df.xs(slice('2016-01-01', '2016-01-01'), level='date')

Documentation: http://pandas.pydata.org/pandas-docs/stable/generated/pandas.DataFrame.xs.html
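
For comparison, the boolean-mask workaround mentioned in the question can be made runnable as a minimal sketch. The helper name `slice_by_date` and its parameters are hypothetical (they are not part of pandas), and the sketch assumes the `date` level holds datetime values. It builds the month range explicitly rather than relying on partial-string slicing:

```python
from datetime import datetime as dt

import pandas as pd

# Same MultiIndex frame as in the question.
df = pd.DataFrame(
    data={'data': [1, 2, 3]},
    index=pd.MultiIndex.from_tuples(
        [(dt(2016, 1, 1), 2), (dt(2016, 1, 1), 3), (dt(2016, 1, 2), 2)],
        names=['date', 'val'],
    ),
)


def slice_by_date(frame, start, end, level='date'):
    """Hypothetical helper: keep rows whose `level` value lies in [start, end]."""
    dates = frame.index.get_level_values(level)
    # Compare against explicit timestamps; both bounds are inclusive.
    mask = (dates >= pd.Timestamp(start)) & (dates <= pd.Timestamp(end))
    return frame.loc[mask]


# The whole of January 2016, with the month boundaries spelled out.
january = slice_by_date(df, '2016-01-01', '2016-01-31')

# A single day.
first_day = slice_by_date(df, '2016-01-01', '2016-01-01')
```

Note that the end bound here is midnight on 2016-01-31, so rows with a later time on that day would be excluded. If the index carries times of day, an exclusive upper bound such as `dates < pd.Timestamp('2016-02-01')` may be the safer choice.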

