Problem description
I want to use a pandas DataFrame to keep track of some market data I will be downloading live during the trading day.
Let's say I want to log the prices of AAPL and GOOG. I start by creating a DataFrame:
from pandas import DataFrame
prices = DataFrame(columns=['AAPL', 'GOOG'])
Let's say the first data point comes in at time t1 with a price of 555.0 for AAPL. Then, a few seconds later at t2, a price of 430.0 comes in for GOOG.
Of course, I can't simply do:
prices['AAPL'][t1] = 555.0
prices['GOOG'][t2] = 430.0
Is there an easy/fast way in pandas to accomplish this, other than pulling the index, modifying it, reindexing the DataFrame, and then inserting each scalar price as it comes in?
Recommended answer
Check out the set_value method (which returns a reference to a new object if the size is mutated). But don't expect it to be fast (compared with a nested dict):
In [7]: prices
Out[7]:
Empty DataFrame
Columns: array([AAPL, GOOG], dtype=object)
Index: array([], dtype=object)
In [8]: prices = prices.set_value(t1, 'AAPL', 5)
In [9]: prices
Out[9]:
                            AAPL  GOOG
2012-04-12 18:02:28.178331     5   NaN
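Note that set_value was deprecated in pandas 0.21 and removed in pandas 1.0, so the session above only runs on old versions. On a current pandas, a rough equivalent is label-based assignment with .loc, which enlarges the frame when the row label does not exist yet. The following is a minimal sketch, not the answer's original code; the concrete timestamps t1 and t2 are made-up values for illustration, and the float dtype on the empty frame is an assumption chosen to keep the columns numeric.

```python
import pandas as pd

# Empty frame with the two ticker columns; float dtype is an assumption
# so that prices stay numeric rather than object.
prices = pd.DataFrame(columns=["AAPL", "GOOG"], dtype=float)

# Hypothetical arrival times standing in for t1 and t2.
t1 = pd.Timestamp("2012-04-12 18:02:28.178331")
t2 = pd.Timestamp("2012-04-12 18:02:31.500000")

# Assigning to a row label that does not exist yet appends a new row
# ("setting with enlargement"); the untouched column is filled with NaN.
prices.loc[t1, "AAPL"] = 555.0
prices.loc[t2, "GOOG"] = 430.0

print(prices)
```

As with set_value, each enlargement reallocates the underlying data, so this is convenient rather than fast.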
It would be nice to add a method at some point for more efficiently resizing a DataFrame by gluing on data at the end (NumPy does have a facility for this).
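Since the answer compares set_value unfavourably with a nested dict, one workaround is to collect ticks in a plain nested dict and build the DataFrame only when it is needed. The sketch below shows one possible arrangement under that assumption; the record helper, the ticks variable, and the timestamps are hypothetical names and values introduced for illustration.

```python
from collections import defaultdict

import pandas as pd

# Nested dict keyed by timestamp, then by symbol: {timestamp: {symbol: price}}.
ticks = defaultdict(dict)


def record(ts, symbol, price):
    """Store one incoming price; this is a plain dict write, no pandas involved."""
    ticks[ts][symbol] = price


# Hypothetical arrival times standing in for t1 and t2.
t1 = pd.Timestamp("2012-04-12 18:02:28.178331")
t2 = pd.Timestamp("2012-04-12 18:02:31.500000")

record(t1, "AAPL", 555.0)
record(t2, "GOOG", 430.0)

# Build the frame once: outer keys become the index, inner keys the columns.
# reindex fixes the column order and adds any ticker that has no ticks yet.
prices = (
    pd.DataFrame.from_dict(ticks, orient="index")
    .reindex(columns=["AAPL", "GOOG"])
    .sort_index()
)

print(prices)
```

The trade-off is that the DataFrame is a snapshot: it has to be rebuilt (or rebuilt periodically) to reflect ticks recorded after it was created.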