Python Pandas rolling_apply: applying a function to two columns of input


Problem description

Following on from the question "Python custom function using rolling_apply for pandas", about using rolling_apply. Although I have made progress with my function, I am struggling with a function that requires two or more columns as inputs:

Creating the same setup as before:

import pandas as pd
import numpy as np
import random

tmp  = pd.DataFrame(np.random.randn(2000,2)/10000,
                    index=pd.date_range('2001-01-01',periods=2000),
                    columns=['A','B'])

But changing the function slightly so that it takes two columns:

def gm(df,p):
    df = pd.DataFrame(df)
    v =((((df['A']+df['B'])+1).cumprod())-1)*p
    return v.iloc[-1]

It produces the following error:

pd.rolling_apply(tmp,50,lambda x: gm(x,5))

  KeyError: u'no item named A'

I think this is because the input to the lambda function is an ndarray of length 50 containing only the first column, rather than both columns. Is there a way to get both columns as inputs and use them in a rolling_apply function?

Again, any help would be greatly appreciated.

Recommended answer

It looks like rolling_apply tries to convert the input of the user function into an ndarray (http://pandas.pydata.org/pandas-docs/stable/generated/pandas.stats.moments.rolling_apply.html?highlight=rolling_apply#pandas.stats.moments.rolling_apply).
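To illustrate the point about the ndarray conversion, here is a minimal sketch. It assumes a newer pandas release and therefore uses the DataFrame.rolling(...).apply(...) interface instead of the pd.rolling_apply call from the question. The small frame demo and the helper record_shape are hypothetical names introduced only for this illustration. The sketch records what the user function actually receives for each window.

```python
import numpy as np
import pandas as pd

# Small hypothetical frame with the same column layout as in the question.
demo = pd.DataFrame(np.random.randn(20, 2) / 10000, columns=['A', 'B'])

seen = []  # (type name, shape) of every window handed to the function

def record_shape(window):
    # With raw=True the window arrives as a plain NumPy array.
    seen.append((type(window).__name__, window.shape))
    return 0.0  # a rolling apply expects one scalar per window

# The function is applied to each column separately, window by window.
demo.rolling(5).apply(record_shape, raw=True)

print(set(seen))
```

Each window should show up as a one-dimensional array taken from a single column, so a lookup such as window['A'] inside the function has nothing to find. This is consistent with the KeyError reported in the question.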

A workaround is to add an auxiliary column ii and roll over that column alone; the values of ii in each window are then used inside the function gm to select the matching rows of the full DataFrame:

import pandas as pd
import numpy as np
import random

tmp = pd.DataFrame(np.random.randn(2000,2)/10000, columns=['A','B'])
tmp['date'] = pd.date_range('2001-01-01',periods=2000)
tmp['ii'] = range(len(tmp))

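def gm(ii, df, p):
    # ii holds the row positions of the current window; select those rows
    x_df = df.iloc[[int(i) for i in ii]]
    # cumulative return of A + B over the window, scaled by p
    v =((((x_df['A']+x_df['B'])+1).cumprod())-1)*p
    return v.iloc[-1]

# roll over the auxiliary column only; gm looks up columns A and B in tmp
res = pd.rolling_apply(tmp.ii, 50, lambda x: gm(x, tmp, 5))
print(res)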
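The same workaround can be sketched for newer pandas releases, where pd.rolling_apply is no longer available. This is an adaptation under assumptions, not the answer's original code: it swaps in Series.rolling(...).apply(..., raw=True) and casts the window values with astype(int), because the rolling machinery hands the positions over as floats.

```python
import numpy as np
import pandas as pd

tmp = pd.DataFrame(np.random.randn(2000, 2) / 10000, columns=['A', 'B'])
tmp['ii'] = range(len(tmp))  # auxiliary positional index, as in the answer

def gm(ii, df, p):
    # ii holds the row positions of the current window (as floats),
    # so cast to int before selecting the rows from the full frame.
    x_df = df.iloc[ii.astype(int)]
    v = ((x_df['A'] + x_df['B'] + 1).cumprod() - 1) * p
    return v.iloc[-1]

# raw=True passes each window as a NumPy array rather than a Series.
res = tmp['ii'].rolling(50).apply(lambda ii: gm(ii, tmp, 5), raw=True)
print(res.tail())
```

The first 49 entries of res are NaN because a full window of 50 rows is not yet available. Looking rows up through the full frame on every window is slow for large data, so this is best treated as a convenience for moderately sized frames.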
