How to apply linregress by group in pandas

Question

I would like to apply scipy.stats.linregress within a pandas groupby. I looked through the documentation, but all I could find was how to apply a function to a single column, like

grouped.agg(np.sum)

or something similar, such as

grouped.agg({'D': lambda x: np.std(x, ddof=1)})

But how do I apply linregress, which takes TWO inputs, X and Y?

Recommended answer

The linregress function, like many other scipy/numpy functions, accepts "array-like" X and Y, and both Series and DataFrame qualify.

For example:

import numpy as np
import pandas as pd
from scipy.stats import linregress
X = pd.Series(np.arange(10))
Y = pd.Series(np.arange(10))

In [4]: linregress(X, Y)
Out[4]: (1.0, 0.0, 1.0, 4.3749999999999517e-80, 0.0)

In fact, being able to use scipy (and numpy) functions directly is one of pandas' killer features!

So if you have a DataFrame, you can use linregress on its columns (which are Series):

linregress(df['col_X'], df['col_Y'])

And if you are using a groupby, you can similarly apply it to each group:

grouped.apply(lambda x: linregress(x['col_X'], x['col_Y']))
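To make the per-group call concrete, here is a minimal end-to-end sketch. The DataFrame, the grouping column name `group`, and the helper `fit_group` are hypothetical and invented for illustration; only `col_X` and `col_Y` come from the answer above. It also assumes a SciPy version whose linregress result exposes named fields (`slope`, `intercept`, and so on), and it wraps them in a Series so that the groupby returns one row per group rather than a column of result objects.

```python
import numpy as np
import pandas as pd
from scipy.stats import linregress

# Hypothetical data: two groups with known linear relationships.
df = pd.DataFrame({
    "group": ["a"] * 5 + ["b"] * 5,
    "col_X": np.tile(np.arange(5.0), 2),
})
# Group "a" follows y = 2x + 1, group "b" follows y = -x + 3.
df["col_Y"] = np.where(df["group"] == "a",
                       2.0 * df["col_X"] + 1.0,
                       -1.0 * df["col_X"] + 3.0)

def fit_group(g):
    """Run linregress on one group and return the named results as a Series."""
    res = linregress(g["col_X"], g["col_Y"])
    return pd.Series({
        "slope": res.slope,
        "intercept": res.intercept,
        "rvalue": res.rvalue,
        "pvalue": res.pvalue,
        "stderr": res.stderr,
    })

# Select the two data columns first so only they are passed to fit_group.
fits = df.groupby("group")[["col_X", "col_Y"]].apply(fit_group)
print(fits[["slope", "intercept"]])
```

Returning a Series from the applied function is one possible design choice: pandas then assembles the per-group results into a DataFrame indexed by group, which tends to be easier to filter or merge than a Series of tuples.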
