Question

I want to use a pandas DataFrame to keep track of some market data that I will be downloading live during the trading day.

Let's say I want to log the prices of AAPL and GOOG. I start by creating a DataFrame:

prices = DataFrame(columns = ['AAPL', 'GOOG'])

Let's say the first data point comes in at time t1 with a price of 555.0 for AAPL. Then, a few seconds later at t2, a price of 430.0 comes in for GOOG.

Of course, I can't simply do:

prices['AAPL'][t1] = 555.0
prices['GOOG'][t2] = 430.0

Is there an easy and fast way in pandas to accomplish this, other than pulling the index, modifying it, reindexing the DataFrame, and then inserting each scalar price as it comes in?

Recommended answer

Check out the set_value method (which returns a reference to a new object if the size is mutated). But don't expect it to be fast (compared with a nested dict):

In [7]: prices
Out[7]:
Empty DataFrame
Columns: array([AAPL, GOOG], dtype=object)
Index: array([], dtype=object)

In [8]: prices = prices.set_value(t1, 'AAPL', 5)

In [9]: prices
Out[9]:
                            AAPL  GOOG
2012-04-12 18:02:28.178331     5   NaN
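
The session above relies on set_value, which was deprecated in pandas 0.21 and removed in pandas 1.0. On a recent pandas, a rough equivalent is label-based assignment through .loc, which enlarges the frame when the row label does not exist yet. The sketch below is only an illustration under that assumption: the timestamps are made-up values standing in for t1 and t2, and the float64 dtype on the empty frame is a choice made here, not something from the original answer.

```python
import pandas as pd

# Empty frame with one float column per symbol (dtype chosen for this sketch).
prices = pd.DataFrame(columns=["AAPL", "GOOG"], dtype="float64")

# Hypothetical arrival times standing in for t1 and t2.
t1 = pd.Timestamp("2012-04-12 18:02:28.178331")
t2 = pd.Timestamp("2012-04-12 18:02:31.500000")

# Assigning to a row label that does not exist yet adds the row;
# the other column in that row is left as NaN.
prices.loc[t1, "AAPL"] = 555.0
prices.loc[t2, "GOOG"] = 430.0
```

Each such assignment grows the frame by one row, so, as with set_value, this is convenient rather than fast.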

It would be nice to add a method at some point for resizing a DataFrame more efficiently by gluing data onto the end (NumPy does have a facility for this).
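
Until such a method exists, one way to follow the nested-dict suggestion is to buffer incoming ticks in a plain dict and build the DataFrame once, when it is actually needed. This is a minimal sketch of that idea, not part of the original answer: the ticks buffer, the record helper, and the timestamps are all hypothetical names and values introduced for illustration.

```python
import pandas as pd

# Hypothetical buffer: {timestamp: {symbol: price}}
ticks = {}

def record(ts, symbol, price):
    """Store one price tick; cheap compared with resizing a DataFrame."""
    ticks.setdefault(ts, {})[symbol] = price

# Hypothetical arrival times standing in for t1 and t2.
t1 = pd.Timestamp("2012-04-12 18:02:28.178331")
t2 = pd.Timestamp("2012-04-12 18:02:31.500000")

record(t1, "AAPL", 555.0)
record(t2, "GOOG", 430.0)

# Build the frame once: outer keys become the index, inner keys the columns.
# reindex fixes the column order and fills symbols with no ticks with NaN.
prices = pd.DataFrame.from_dict(ticks, orient="index").reindex(
    columns=["AAPL", "GOOG"]
)
```

The trade-off is that the DataFrame is a snapshot: it has to be rebuilt (or the new rows concatenated) whenever an up-to-date view is required.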

