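I am trying to run numpy's correlate function column by column on a DataFrame, but the correlation I want is of each series with itself. For example, suppose df is our DataFrame and ts is the first column of df. I want to call np.correlate(ts, ts, mode="full")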

df = pd.DataFrame([[1,1],[2,2],[3,3],[4,4],[5,5]], index=range(5), columns=list("ab"))

def acf(R):
    """
    Calculate the autocorrelation function of a series for lags 0 through
    len(series) - 1.
    """
    y = R - R.mean()
    result = y.apply(np.correlate, (y, "full"))
    result = result[len(result)//2:]
    result /= result[0]
    return result

acf(df)

NameError: name 'y' is not defined


What should I do to make this work?

Best answer

pandas.Series objects generally work well with numpy functions, so define the function to operate on a single series:

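import numpy as np

def acf(R):
    """
    Calculate the autocorrelation function of a series for lags 0 through
    len(series) - 1.
    """
    y = R - R.mean()                      # center the series
    result = np.correlate(y, y, 'full')   # correlation at every lag, negative and positive
    result = result[len(result)//2:]      # keep the non-negative lags only
    result /= result[0]                   # normalize so that lag 0 equals 1
    return result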


Applying it to the DataFrame with df.apply(acf) should then work, because apply passes each column to acf as a Series.

In [4]: import numpy as np

In [5]: import pandas as pd
   ...: def acf(R):
   ...:     """
   ...:     Calculate the autocorrelation function of a series for lags 0 through
   ...:     len(series) - 1.
   ...:     """
   ...:     y = R - R.mean()
   ...:     result = np.correlate(y, y, 'full')
   ...:     result = result[len(result)//2:]
   ...:     result /= result[0]
   ...:     return result
   ...: df = pd.DataFrame([[1,1],[2,2],[3,3],[4,4],[5,5]], index=range(5), columns=list("ab"))
   ...:

In [6]: df.apply(acf)
Out[6]:
     a    b
0  1.0  1.0
1  0.4  0.4
2 -0.1 -0.1
3 -0.4 -0.4
4 -0.4 -0.4
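
As a sanity check on the output above, the same numbers can be reproduced by writing out the lag sums explicitly. This is only a minimal sketch: acf_direct is a hypothetical helper introduced here for comparison, not part of the original answer, and it assumes the same normalization (divide by the lag-0 sum).

```python
import numpy as np
import pandas as pd

def acf(R):
    # Same function as in the answer: centered series correlated with itself.
    y = R - R.mean()
    result = np.correlate(y, y, 'full')
    result = result[len(result)//2:]
    result /= result[0]
    return result

def acf_direct(R):
    # Hypothetical comparison helper: for each lag k, sum y[t] * y[t + k]
    # over the overlapping part of the series, then divide by the lag-0 sum.
    y = np.asarray(R - R.mean(), dtype=float)
    n = len(y)
    sums = np.array([np.dot(y[:n - k], y[k:]) for k in range(n)])
    return sums / sums[0]

df = pd.DataFrame([[1,1],[2,2],[3,3],[4,4],[5,5]], index=range(5), columns=list("ab"))

out = df.apply(acf)            # np.correlate-based version, one call per column
check = df.apply(acf_direct)   # explicit lag sums, one call per column

print(np.allclose(out.values, check.values))
```

If both versions agree, the print call shows True, which suggests the slicing in acf really does select lags 0 through len(series) - 1.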

This question and answer come from the Stack Overflow question on how to apply a function that takes the caller as an argument: https://stackoverflow.com/questions/21518260/
