pandas: Where is the documentation for TimeGrouper?

Question

I use pandas a lot and it's great. I use TimeGrouper as well, and it's great too. But I don't actually know where the documentation for TimeGrouper is. Is there any?

Thanks!

Recommended answer

pd.TimeGrouper() was formally deprecated in pandas v0.21.0 in favor of pd.Grouper(), which is the recommended replacement.
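As a minimal sketch of that migration, the old call is shown only in a comment and the pd.Grouper() spelling is the one that runs. The series `s` and the weekly frequency are made up for illustration and are not part of the original answer.

```python
import pandas as pd

# Hypothetical data: six daily values starting on Saturday, 2010-01-30.
s = pd.Series(range(6), index=pd.date_range("2010-01-30", periods=6))

# Deprecated spelling (pandas < 0.21.0 style):
#     s.groupby(pd.TimeGrouper("W")).sum()
# Replacement: pass the frequency through the `freq` keyword of pd.Grouper.
weekly = s.groupby(pd.Grouper(freq="W")).sum()
print(weekly)
```

"W" means weeks ending on Sunday, so the first bin holds the Saturday and Sunday values and the second bin holds the rest.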

The best use of pd.Grouper() is within groupby() when you're also grouping on non-datetime columns. If you just need to group on a frequency, use resample().

For example, say you have:

>>> import pandas as pd
>>> import numpy as np
>>> np.random.seed(444)

>>> df = pd.DataFrame({'a': np.random.choice(['x', 'y'], size=50),
                       'b': np.random.rand(50)},
                      index=pd.date_range('2010', periods=50))
>>> df.head()
            a         b
2010-01-01  y  0.959568
2010-01-02  x  0.784837
2010-01-03  y  0.745148
2010-01-04  x  0.965686
2010-01-05  y  0.654552

You could do:

>>> # `a` is dropped because it is non-numeric
>>> # (on pandas 2.0+, use .sum(numeric_only=True) to get this behavior)
>>> df.groupby(pd.Grouper(freq='M')).sum()
                  b
2010-01-31  18.5123
2010-02-28   7.7670

But the above is a little unnecessary because you're only grouping on the index. Instead you could do:

>>> df.resample('M').sum()
                  b
2010-01-31  18.5123
2010-02-28   7.7670

This produces the same result.
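The equivalence can be checked directly. The sketch below rebuilds a similar frame and compares the two approaches. It makes three assumptions that differ from the snippets above: an arbitrary seed, selecting column `b` explicitly (so the result does not depend on how a given pandas version treats the string column), and the month-start alias 'MS' rather than 'M', because newer pandas releases rename the month-end alias to 'ME'.

```python
import numpy as np
import pandas as pd

rng = np.random.default_rng(0)  # arbitrary seed, not the one used above
df = pd.DataFrame(
    {"a": rng.choice(["x", "y"], size=50), "b": rng.random(50)},
    index=pd.date_range("2010", periods=50),
)

# Group the index by calendar month in two different ways.
by_grouper = df.groupby(pd.Grouper(freq="MS"))["b"].sum()
by_resample = df["b"].resample("MS").sum()

# Same bin labels and same sums, up to floating-point tolerance.
same_index = by_grouper.index.equals(by_resample.index)
same_values = np.allclose(by_grouper.to_numpy(), by_resample.to_numpy())
print(same_index, same_values)
```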

Conversely, here's a case where Grouper() would be useful:

>>> df.groupby([pd.Grouper(freq='M'), 'a']).sum()
                   b
           a
2010-01-31 x  8.9452
           y  9.5671
2010-02-28 x  4.2522
           y  3.5148
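pd.Grouper() also helps when the timestamps live in an ordinary column instead of the index, a case that a plain resample() on the index does not cover. The following is a sketch under assumptions: the `orders` frame and its column names are hypothetical, and 'MS' (month start) stands in for 'M' to avoid the alias rename in newer pandas versions.

```python
import pandas as pd

# Hypothetical data: the datetime values are in the column `when`.
orders = pd.DataFrame({
    "when": pd.to_datetime(
        ["2010-01-05", "2010-01-20", "2010-02-03", "2010-02-10"]
    ),
    "a": ["x", "y", "x", "x"],
    "b": [1.0, 2.0, 3.0, 4.0],
})

# `key` names the datetime column to bin; `a` is a second, non-datetime key.
monthly = orders.groupby([pd.Grouper(key="when", freq="MS"), "a"])["b"].sum()
print(monthly)
```

The result is indexed by (month, `a`) pairs, in the same way as the index-based example above.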

For some more detail, take a look at Chapter 7 of Ted Petrou's Pandas Cookbook.
