Problem description
I have a pandas time series of 10-minute frequency data and need to find the maximum value in each 24-hour period. However, this 24-hour period needs to start each day at 5 AM, not at the midnight default that pandas assumes.
I've been checking out DateOffset but so far am drawing blanks. I might have expected something akin to pandas.tseries.offsets.Week(weekday=n), e.g. pandas.tseries.offsets.Week(hour=5), but this is not supported as far as I can tell.
I can do a nasty workaround by shifting the data first, but it's unintuitive: even coming back to the same code after just a week, I have trouble wrapping my head around the shift direction!
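For reference, the shift-based workaround described above might look roughly like the sketch below. The sample data and the variable names (`shifted`, `daily_max`) are made up for illustration and are not from the original post. It moves the timestamps back 5 hours so that 05:00 lines up with midnight, resamples by calendar day, then moves the labels forward again.

```python
import numpy as np
import pandas as pd

# Hypothetical sample data: three days of 10-minute readings
idx = pd.date_range("2012-01-01 00:00", freq="10min", periods=6 * 24 * 3)
s = pd.Series(np.arange(len(idx)), index=idx)

# Move every timestamp BACK by 5 hours, so 05:00 becomes 00:00
shifted = s.copy()
shifted.index = shifted.index - pd.Timedelta(hours=5)

# Ordinary midnight-to-midnight daily maximum on the shifted data
daily_max = shifted.resample("D").max()

# Move the bin labels FORWARD by 5 hours so each label is the
# 05:00 start of the window it summarises
daily_max.index = daily_max.index + pd.Timedelta(hours=5)
print(daily_max)
```

The two opposite index adjustments are exactly the part that is easy to get backwards.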
Any more elegant ideas would be much appreciated.
Recommended answer
The base keyword can do the trick (see docs):
s.resample('24h', base=5)
For example:
In [35]: idx = pd.date_range('2012-01-01 00:00:00', freq='5min', periods=24*12*3)
In [36]: s = pd.Series(np.arange(len(idx)), index=idx)
In [38]: s.resample('24h', base=5)
Out[38]:
2011-12-31 05:00:00     29.5
2012-01-01 05:00:00    203.5
2012-01-02 05:00:00    491.5
2012-01-03 05:00:00    749.5
Freq: 24H, dtype: float64
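The session above comes from an older pandas release, where resample returned the mean by default. In recent versions, resample returns a lazy Resampler object that needs an explicit aggregation, and the base keyword was deprecated in pandas 1.1 and later removed in favour of offset. A minimal sketch of the same idea for newer pandas, assuming the same sample data and using .max() as the question asks (the name `daily_max` is only illustrative):

```python
import numpy as np
import pandas as pd

# Same sample data as the session above: three days at 5-minute frequency
idx = pd.date_range("2012-01-01 00:00:00", freq="5min", periods=24 * 12 * 3)
s = pd.Series(np.arange(len(idx)), index=idx)

# offset="5h" moves the 24-hour bin edges from midnight to 05:00
# (assumes pandas >= 1.1, where `offset` is available)
daily_max = s.resample("24h", offset="5h").max()
print(daily_max)
```

Each label is the 05:00 start of its window, so the first row covers the readings from 00:00 to 04:55 on 2012-01-01 and is labelled 2011-12-31 05:00:00.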