numpy / scipy: correlation

Problem description

robert wrote:

Is there a ready-made function in numpy/scipy to quickly compute the correlation y=mx+o of an X and Y, giving:

m, m-err, o, o-err, r-coef, r-coef-err?

Or a formula to compute the 3 error ranges?

-robert

PS: numpy.corrcoef computes only the bare coefficient:

    >>> numpy.corrcoef((0,1,2,3.0),(2,5,6,7.0),)
    array([[ 1.        ,  0.95618289],
           [ 0.95618289,  1.        ]])

With ints it computes a wrong value:

    >>> numpy.corrcoef((0,1,2,3),(2,5,6,7),)
    array([[ 1.        ,  0.94491118],
           [ 0.94491118,  1.        ]])

Solution

Robert Kern wrote:

> robert wrote:
> Is there a ready-made function in numpy/scipy to compute the correlation
> y=mx+o of an X and Y fast: m, m-err, o, o-err, r-coef, r-coef-err?

numpy and scipy questions are best asked on their lists, not here. There are a number of people who know the capabilities of numpy and scipy through and through, but most of them don't hang out on comp.lang.python.

http://www.scipy.org/Mailing_Lists

scipy.optimize.leastsq() can be told to return the covariance matrix of the estimated parameters (m and o in your example; I have no idea what you think r-coeff is).

--
Robert Kern

"I have come to believe that the whole world is an enigma, a harmless enigma that is made terrible by our own mad attempt to interpret it as though it had an underlying truth."
-- Umberto Eco

Robert Kern wrote:

> robert wrote:
>> Is there a ready-made function in numpy/scipy to compute the correlation
>> y=mx+o of an X and Y fast: m, m-err, o, o-err, r-coef, r-coef-err?
>
> scipy.optimize.leastsq() can be told to return the covariance matrix of the
> estimated parameters (m and o in your example; I have no idea what you think
> r-coeff is).

Ah, the correlation coefficient itself. Since correlation coefficients are weird beasts constrained to [-1, 1], standard Gaussian errors like you are expecting for m-err and o-err don't apply. No, there's currently no function in numpy or scipy that will do something sophisticated enough to be reliable.
Here's an option:

http://www.pubmedcentral.nih.gov/art...i?artid=155684

--
Robert Kern

Robert Kern wrote:

> robert wrote:
> Is there a ready-made function in numpy/scipy to compute the correlation
> y=mx+o of an X and Y fast: m, m-err, o, o-err, r-coef, r-coef-err?

And of course, those three parameters are not particularly meaningful together. If your model is truly "y is a linear response given x with normal noise" then "y=m*x+o" is correct, and all of the information that you can get from the data will be found in the estimates of m and o and the covariance matrix of the estimates.

On the other hand, if your model is that "(x, y) is distributed as a bivariate normal distribution" then "y=m*x+o" is not a particularly good representation of the model. You should instead estimate the mean vector and covariance matrix of (x, y). Your correlation coefficient will be the off-diagonal term after dividing out the marginal standard deviations.

The difference between the two models is that the first places no restrictions on the distribution of x. The second does; both the x and y marginal distributions need to be normal. Under the first model, the correlation coefficient has no meaning.

--
Robert Kern
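For the first model in the thread ("y is a linear response given x with normal noise"), here is a minimal sketch of getting m, o and their standard errors from the covariance matrix of the estimates. It is not the code anyone in the thread posted. It swaps scipy.optimize.leastsq() for a plain ordinary least-squares fit with numpy.linalg.lstsq, and the function name linear_fit_with_errors is made up for illustration. The data are the four points from the question.

```python
import numpy as np


def linear_fit_with_errors(x, y):
    """Ordinary least-squares fit of y = m*x + o (hypothetical helper).

    Returns m, m_err, o, o_err and the 2x2 covariance matrix of (m, o).
    """
    x = np.asarray(x, dtype=float)
    y = np.asarray(y, dtype=float)
    n = x.size
    if n < 3:
        raise ValueError("need at least 3 points to estimate the errors")

    # Design matrix with columns [x, 1], so the parameters are (m, o).
    A = np.column_stack([x, np.ones(n)])
    params, _, _, _ = np.linalg.lstsq(A, y, rcond=None)

    # Residual variance with n - 2 degrees of freedom (two fitted parameters).
    residuals = y - A @ params
    s2 = residuals @ residuals / (n - 2)

    # Covariance of the estimates, assuming independent normal noise on y.
    cov = s2 * np.linalg.inv(A.T @ A)
    m, o = params
    m_err, o_err = np.sqrt(np.diag(cov))
    return m, m_err, o, o_err, cov


m, m_err, o, o_err, cov = linear_fit_with_errors((0, 1, 2, 3), (2, 5, 6, 7))
print(f"m = {m:.3f} +/- {m_err:.3f}")
print(f"o = {o:.3f} +/- {o_err:.3f}")
```

For these four points the fit gives m = 1.6 and o = 2.6, with standard errors of roughly 0.35 and 0.65. If you go the scipy.optimize.leastsq() route instead, note that the cov_x it returns with full_output=True still has to be multiplied by the residual variance, which is the same scaling by s2 done by hand here.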
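For the second model ("(x, y) is distributed as a bivariate normal distribution"), the recipe in the last reply can be sketched as below: estimate the mean vector and covariance matrix, then divide the off-diagonal term by the marginal standard deviations. The interval function is an addition of mine, not something from the thread and not necessarily what the linked article does: it uses the Fisher z-transform, a common approximation that assumes bivariate normality and is rough for very small samples. Both function names are hypothetical.

```python
import math

import numpy as np


def correlation_from_covariance(x, y):
    """Mean vector, covariance matrix and r for paired samples (hypothetical helper)."""
    data = np.vstack([np.asarray(x, dtype=float), np.asarray(y, dtype=float)])
    mean = data.mean(axis=1)
    cov = np.cov(data)            # 2x2 sample covariance matrix
    std = np.sqrt(np.diag(cov))   # marginal standard deviations
    r = cov[0, 1] / (std[0] * std[1])
    return mean, cov, r


def fisher_interval(r, n, z_crit=1.959964):
    """Approximate 95% interval for r via the Fisher z-transform.

    Assumption: (x, y) is bivariate normal; z_crit is the normal quantile
    for a two-sided 95% interval.
    """
    if n < 4:
        raise ValueError("need at least 4 points")
    z = math.atanh(r)
    half_width = z_crit / math.sqrt(n - 3)
    # Transforming back with tanh keeps both bounds inside (-1, 1).
    return math.tanh(z - half_width), math.tanh(z + half_width)


x = (0, 1, 2, 3)
y = (2, 5, 6, 7)
mean, cov, r = correlation_from_covariance(x, y)
low, high = fisher_interval(r, len(x))
print(f"r = {r:.4f}, approximate 95% interval = ({low:.3f}, {high:.3f})")
```

The inputs are cast to float before anything is computed, so integer and float data give the same r (about 0.956 for the points from the question). The interval comes out strongly asymmetric around r, which is the point made in the thread: a symmetric Gaussian "r-coef-err" does not fit a quantity confined to [-1, 1], and with only four points the interval is very wide.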