numpy cov (covariance) function: what exactly does it compute?


Question

I assume numpy.cov(X) computes the sample covariance matrix as:

1/(N-1) * Sum (x_i - m)(x_i - m)^T (where m is the mean)

I.e., the sum of outer products. But nowhere in the documentation does it actually say this; it just says "Estimate a covariance matrix".

Can anyone confirm whether this is what it does internally? (I know I can change the constant out the front with the bias parameter.)

Recommended answer

As you can see from the source, in the simplest case, with no masks and N variables with M samples each, it returns the (N, N) covariance matrix calculated as:

(x-m) * (x-m).T.conj() / (M - 1)

where * denotes the matrix product.

It is implemented roughly as:

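import numpy as np

def cov_sketch(X):
    # Rows are variables, columns are samples (the default layout for numpy.cov).
    X = np.array(X, ndmin=2, dtype=float)   # work on a copy
    X -= X.mean(axis=1, keepdims=True)      # subtract each variable's mean
    M = X.shape[1]                          # number of samples per variable
    fact = float(M - 1)                     # unbiased normalisation (bias off)
    return np.dot(X, X.T.conj()) / fact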
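To check the formula above against the library itself, here is a small sketch. It assumes the default layout (rows are variables, columns are samples); the random test data and the variable names are purely illustrative.

```python
import numpy as np

# Illustrative data: 3 variables (rows), 50 samples each (columns).
rng = np.random.default_rng(0)
X = rng.normal(size=(3, 50))

# Centre each variable, then form the matrix product by hand.
Xc = X - X.mean(axis=1, keepdims=True)
M = X.shape[1]
manual = Xc @ Xc.T.conj() / (M - 1)      # default normalisation
manual_biased = Xc @ Xc.T.conj() / M     # what bias=True should give

matches_default = np.allclose(manual, np.cov(X))
matches_biased = np.allclose(manual_biased, np.cov(X, bias=True))
print(matches_default, matches_biased)
```

If both comparisons hold on your installation, the only thing the bias parameter changes is the constant out front, as the question suggests.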

If you want to review the source, look here instead of the link from Mr E unless you're interested in masked arrays. As you mentioned, the documentation isn't great.

The matrix product in this case is effectively (but not exactly) an outer product: (x-m) has M column vectors of length N, one per sample, and thus (x-m).T has as many row vectors. The end result is the sum of all the outer products. The same * would give the inner (scalar) product if the order of a column vector and a row vector were reversed. Technically, though, these are both just standard matrix multiplications, and a true outer product is only the product of a column vector with a row vector.
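The sum-of-outer-products reading can be checked numerically. The sketch below again assumes rows are variables and columns are samples, and uses made-up random data; it compares an explicit sum of per-sample outer products with the single matrix product.

```python
import numpy as np

# Illustrative data: 3 variables (rows), 5 samples (columns).
rng = np.random.default_rng(1)
X = rng.normal(size=(3, 5))
Xc = X - X.mean(axis=1, keepdims=True)

# One outer product per sample (per column), summed over all samples.
outer_sum = sum(np.outer(Xc[:, k], Xc[:, k]) for k in range(Xc.shape[1]))
same_as_matmul = np.allclose(outer_sum, Xc @ Xc.T)

# For a single vector: column-times-row is a matrix, row-times-column is a scalar.
v = Xc[:, 0]
outer_shape = np.outer(v, v).shape
inner_value = np.inner(v, v)
print(same_as_matmul, outer_shape, np.ndim(inner_value))
```

Dividing `outer_sum` by the number of samples minus one then gives the same matrix as the formula in the answer.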
