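I am trying to run NumPy's correlate function column by column on a DataFrame, but the correlation I want is of each series with itself. For example, suppose df is our DataFrame and ts is the first column of df. I want to call np.correlate(ts, ts, mode="full").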
import numpy as np
import pandas as pd

df = pd.DataFrame([[1,1],[2,2],[3,3],[4,4],[5,5]], index=range(5), columns=list("ab"))

def acf(R):
    """
    Calculate the autocorrelation function of a series with lag 0 up to the length
    of the series.
    """
    y = R - R.mean()
    result = y.apply(np.correlate, (y, "full"))
    result = result[len(result)//2:]
    result /= result[0]
    return result

acf(df)
NameError: name 'y' is not defined
What should I do to make this work?
Best answer
pandas.Series objects generally work well with numpy functions, so define the function as
def acf(R):
    """
    Calculate the autocorrelation function of a series with lag 0 up to the length
    of the series.
    """
    y = R - R.mean()
    result = np.correlate(y, y, 'full')
    result = result[len(result)//2:]
    result /= result[0]
    return result
Then applying it to the DataFrame with df.apply(acf) should work.
In [4]: import numpy as np
In [5]: import pandas as pd
   ...: def acf(R):
   ...:     """
   ...:     Calculate the autocorrelation function of a series with lag 0 up to the length
   ...:     of the series.
   ...:     """
   ...:     y = R - R.mean()
   ...:     result = np.correlate(y, y, 'full')
   ...:     result = result[len(result)//2:]
   ...:     result /= result[0]
   ...:     return result
   ...: df = pd.DataFrame([[1,1],[2,2],[3,3],[4,4],[5,5]], index=range(5), columns=list("ab"))
   ...:
In [6]: df.apply(acf)
Out[6]:
     a    b
0  1.0  1.0
1  0.4  0.4
2 -0.1 -0.1
3 -0.4 -0.4
4 -0.4 -0.4
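As a sanity check, the same numbers can be reproduced with an explicit sum over each lag. The sketch below is only an illustration: acf_manual is a hypothetical helper name, not part of the original answer. It relies on the fact that np.correlate in 'full' mode returns 2n-1 values covering lags -(n-1) through n-1, so the slice result[len(result)//2:] keeps lags 0 through n-1.

```python
import numpy as np
import pandas as pd

def acf(R):
    """Autocorrelation via np.correlate, as in the answer above."""
    y = R - R.mean()
    result = np.correlate(y, y, "full")
    result = result[len(result) // 2:]  # keep lags 0 .. n-1
    result /= result[0]                 # normalize so that lag 0 equals 1
    return result

def acf_manual(values):
    """Hypothetical reference version: one dot product per lag."""
    y = np.asarray(values, dtype=float)
    y = y - y.mean()
    n = len(y)
    denom = np.dot(y, y)  # lag-0 value used for normalization
    return np.array([np.dot(y[:n - k], y[k:]) / denom for k in range(n)])

df = pd.DataFrame([[1, 1], [2, 2], [3, 3], [4, 4], [5, 5]],
                  index=range(5), columns=list("ab"))

fast = df.apply(acf)         # column-wise, using np.correlate
slow = df.apply(acf_manual)  # column-wise, using explicit sums

# Both approaches should agree up to floating-point error.
print(np.allclose(fast.to_numpy(), slow.to_numpy()))
```

The explicit version is slower for long series, but it makes clear what each entry of the result represents.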
Regarding "python - how to apply a function that takes the caller as an argument", a similar question can be found on Stack Overflow: https://stackoverflow.com/questions/21518260/