我想做什么?
获取每周看到最高价格的日期和时间。
我尝试了什么?
我正在尝试使用groupby
和transform
来获取idxmax
。当将transform
与idxmax
一起使用时,我得到的输出是numpy.float64数据,但是在df上使用idxmax
时,它返回pandas.tslib.Timestamp数据。
测试数据:
rng = pd.date_range('1/1/2011', periods=100, freq='D')
np.random.seed(4)
stocks = pd.DataFrame({
'price':(np.random.randn(100).cumsum() + 10) },index = rng)
stocks['week_id'] = pd.to_datetime(stocks.index).week #used for the groupby
因此,如果我使用
groupby
来获取max
价格,它将返回我期望的数据:stocks.groupby(stocks['week_id'])['price'].transform('max')
2011-01-01 10.550513
2011-01-02 10.550513
2011-01-03 10.248203
2011-01-04 10.248203.....
如果然后尝试获取所有数据的
idxmax
:stocks['price'].idxmax()
它返回pandas.tslib.Timestamp数据:
Timestamp('2011-03-20 00:00:00', freq='D')
但是,如果我尝试使用via groupby进行转换:
stocks.groupby(stocks['week_id'])['price'].transform('idxmax')
我得到numpy.float64数据:
stocks.groupby(stocks['week_id'])['price'].transform('idxmax').head()
2011-01-01 1.293926e+18
2011-01-02 1.293926e+18
2011-01-03 1.294099e+18
2011-01-04 1.294099e+18
2011-01-05 1.294099e+18
Freq: D, Name: price, dtype: float64
所需的输出是一个新的
week_max
列,如下所示: price week_id week_high week_max
2011-01-01 10.050562 52 10.550513 2011-01-02
2011-01-02 10.550513 52 10.550513 2011-01-02
2011-01-03 9.554604 1 10.248203 2011-01-04
2011-01-04 10.248203 1 10.248203 2011-01-04
谢谢你的帮助。
最佳答案
看来是虫子。
替代解决方案是join
aggregate
ed max
和idxmax
:
rng = pd.date_range('1/1/2011', periods=15, freq='D')
np.random.seed(4)
stocks = pd.DataFrame({
'price':(np.random.randn(15).cumsum() + 10) },index = rng)
stocks['week_id'] = pd.to_datetime(stocks.index).week #used for the groupby
#print (stocks)
df1 = stocks.groupby('week_id')['price'].agg(['max', 'idxmax'])
df1.columns = ['week_high', 'week_max']
print (df1)
week_high week_max
week_id
1 10.248203 2011-01-04
2 8.668974 2011-01-14
52 10.550513 2011-01-02
print (stocks.join(df1, on='week_id'))
price week_id week_high week_max
2011-01-01 10.050562 52 10.550513 2011-01-02
2011-01-02 10.550513 52 10.550513 2011-01-02
2011-01-03 9.554604 1 10.248203 2011-01-04
2011-01-04 10.248203 1 10.248203 2011-01-04
2011-01-05 9.829901 1 10.248203 2011-01-04
2011-01-06 8.245324 1 10.248203 2011-01-04
2011-01-07 7.597617 1 10.248203 2011-01-04
2011-01-08 8.196192 1 10.248203 2011-01-04
2011-01-09 8.528442 1 10.248203 2011-01-04
2011-01-10 7.380966 2 8.668974 2011-01-14
2011-01-11 7.999635 2 8.668974 2011-01-14
2011-01-12 7.911648 2 8.668974 2011-01-14
2011-01-13 8.336721 2 8.668974 2011-01-14
2011-01-14 8.668974 2 8.668974 2011-01-14
2011-01-15 7.512158 2 8.668974 2011-01-14