Question
I understand that OHLC resampling of time series data in pandas works well on a single column of data, for example on the following DataFrame:
>>df
ctime openbid
1443654000 1.11700
1443654060 1.11700
...
df['ctime'] = pd.to_datetime(df['ctime'], unit='s')
df = df.set_index('ctime')
df.resample('1H', how='ohlc', axis=0, fill_method='bfill')
>>>
open high low close
ctime
2015-09-30 23:00:00 1.11700 1.11700 1.11687 1.11697
2015-10-01 00:00:00 1.11700 1.11712 1.11697 1.11697
...
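On recent pandas releases the how= and fill_method= keywords are no longer accepted by resample(), so the call above would need rewriting. Below is a minimal sketch of the method-chaining equivalent. The sample values are made up for illustration, and the "60min" alias is used only because it is spelled the same way across pandas versions.

```python
import pandas as pd

# Hypothetical sample ticks (illustrative values, not the asker's full data).
df = pd.DataFrame({
    "ctime": [1443654000, 1443654060, 1443657600],
    "openbid": [1.11700, 1.11701, 1.11705],
})

# Convert the epoch seconds to timestamps and use them as the index.
df["ctime"] = pd.to_datetime(df["ctime"], unit="s")
df = df.set_index("ctime")

# Resample the single price column into hourly open/high/low/close bars.
# Chaining .bfill() afterwards would backfill any empty bins, which is
# an assumption about what fill_method='bfill' was meant to achieve.
bars = df["openbid"].resample("60min").ohlc()
print(bars)
```

Selecting the single column before resampling gives a flat set of open, high, low, close columns rather than a column MultiIndex.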
But what do I do if the data is already in OHLC format? From what I can gather, the ohlc method computes an OHLC slice for every column, so suppose my data is in this format:
ctime openbid highbid lowbid closebid
0 1443654000 1.11700 1.11700 1.11687 1.11697
1 1443654060 1.11700 1.11712 1.11697 1.11697
2 1443654120 1.11701 1.11708 1.11699 1.11708
When I try to resample, I get an OHLC for each of the columns, like so:
openbid highbid \
open high low close open high
ctime
2015-09-30 23:00:00 1.11700 1.11700 1.11700 1.11700 1.11700 1.11712
2015-09-30 23:01:00 1.11701 1.11701 1.11701 1.11701 1.11708 1.11708
...
lowbid \
low close open high low close
ctime
2015-09-30 23:00:00 1.11700 1.11712 1.11687 1.11697 1.11687 1.11697
2015-09-30 23:01:00 1.11708 1.11708 1.11699 1.11699 1.11699 1.11699
...
closebid
open high low close
ctime
2015-09-30 23:00:00 1.11697 1.11697 1.11697 1.11697
2015-09-30 23:01:00 1.11708 1.11708 1.11708 1.11708
Is there a quick(ish) workaround for this that someone is willing to share, without me having to get knee-deep in the pandas manual?
Thanks.
P.S. There is this answer, "Converting OHLC stock data into a different timeframe with python and pandas", but it is four years old, so I am hoping there has been some progress since.
Recommended answer
This is similar to the answer you linked, but it is a little cleaner and faster, because it uses the optimized built-in aggregations rather than lambdas.
Note that the resample(...).agg(...) syntax requires pandas version 0.18.0 or later.
In [101]: df.resample('1H').agg({'openbid': 'first',
'highbid': 'max',
'lowbid': 'min',
'closebid': 'last'})
Out[101]:
lowbid highbid closebid openbid
ctime
2015-09-30 23:00:00 1.11687 1.11712 1.11708 1.117
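For anyone who wants to try this end to end, here is a self-contained sketch built from the three sample rows in the question. The names ohlc_rules and hourly are just illustrative, and the "60min" alias is an assumption chosen because it works the same way across pandas versions.

```python
import pandas as pd

# The three one-minute bars shown in the question.
df = pd.DataFrame({
    "ctime": [1443654000, 1443654060, 1443654120],
    "openbid": [1.11700, 1.11700, 1.11701],
    "highbid": [1.11700, 1.11712, 1.11708],
    "lowbid": [1.11687, 1.11697, 1.11699],
    "closebid": [1.11697, 1.11697, 1.11708],
})
df["ctime"] = pd.to_datetime(df["ctime"], unit="s")
df = df.set_index("ctime")

# One aggregation per column: first open, highest high, lowest low, last close.
ohlc_rules = {
    "openbid": "first",
    "highbid": "max",
    "lowbid": "min",
    "closebid": "last",
}

hourly = df.resample("60min").agg(ohlc_rules)

# Select the columns explicitly so the result is in open/high/low/close order,
# and drop any bins that contained no source rows.
hourly = hourly[list(ohlc_rules)].dropna(how="all")
print(hourly)
```

With these three rows the single hourly bar should agree with the output above: open 1.11700, high 1.11712, low 1.11687, close 1.11708.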