

I'm reading Abdi & Williams (2010), "Principal Component Analysis", and I'm trying to redo the SVD to obtain the values needed for further PCA.


The article states the following SVD:

X = P D Q^t

I load my data into a np.array X:

import numpy as np

X = np.array(data)
P, D, Q = np.linalg.svd(X, full_matrices=False)
# svd returns the singular values as a 1-D array; build the diagonal matrix
D = np.diag(D)


But I do not get the above equality when checking with:

X_a = np.dot(np.dot(P, D), Q.T)


X_a and X have the same dimensions, but their values differ. Am I missing something, or is np.linalg.svd somehow incompatible with the equation in the paper?


TL;DR: numpy's SVD computes X = PDQ, so the Q it returns is already transposed.


SVD effectively decomposes the matrix X into rotations P and Q and the diagonal matrix D. np.linalg.svd() returns the third factor already transposed (the NumPy documentation calls it Vh), so the Q you get back corresponds to Q^t in the paper's notation. You don't want to transpose Q again when you calculate X_a.

import numpy as np
X = np.random.normal(size=[20,18])
P, D, Q = np.linalg.svd(X, full_matrices=False)
X_a = np.matmul(np.matmul(P, np.diag(D)), Q)
print(np.std(X), np.std(X_a), np.std(X - X_a))


I get: 1.02, 1.02, 1.8e-15, showing that X_a very accurately reconstructs X.
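
To tie this back to the paper's notation, here is a minimal sketch that checks both the correct reconstruction and the extra transpose from the question. The names Qt, Q_paper, X_ok and X_wrong are introduced here purely for illustration, and the fixed seed is only there to make the run reproducible.

```python
import numpy as np

rng = np.random.default_rng(0)           # fixed seed, only for reproducibility
X = rng.normal(size=(20, 18))

# The third return value is already the transposed factor (Q^t in the paper).
P, d, Qt = np.linalg.svd(X, full_matrices=False)
D = np.diag(d)                           # singular values as a diagonal matrix

Q_paper = Qt.T                           # Q as written in X = P D Q^t

X_ok = P @ D @ Q_paper.T                 # identical to P @ D @ Qt
X_wrong = P @ D @ Qt.T                   # the extra transpose from the question

print(X_wrong.shape == X.shape)          # shapes match, as observed in the question
print(np.allclose(X, X_ok))              # reconstruction without the extra transpose
print(np.allclose(X, X_wrong))           # reconstruction with the extra transpose
```

With a random X like this, the second check should succeed and the third should fail, even though both products have the same shape as X. That is why the mistake is easy to miss.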

If you are using Python 3.5 or later, the @ operator implements matrix multiplication and makes the code easier to follow:

import numpy as np
X = np.random.normal(size=[20,18])
P, D, Q = np.linalg.svd(X, full_matrices=False)
X_a = P @ np.diag(D) @ Q
print(np.std(X), np.std(X_a), np.std(X - X_a))
print('Is X close to X_a?', np.isclose(X, X_a).all())
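
Since the goal is further PCA, here is a rough sketch of how the same decomposition might be carried forward. It assumes the data are column-centered first and that the factor scores are taken as F = P D, which should equal the projection of X onto Q. The names Xc, F, F_proj and explained are illustrative, not from the paper or from numpy.

```python
import numpy as np

rng = np.random.default_rng(0)           # fixed seed, only for reproducibility
X = rng.normal(size=(20, 18))
Xc = X - X.mean(axis=0)                  # assumption: column-center before PCA

P, d, Qt = np.linalg.svd(Xc, full_matrices=False)

F = P * d                                # scales column j of P by d[j], i.e. P @ np.diag(d)
F_proj = Xc @ Qt.T                       # project the centered data onto Q (= Qt.T)

# Share of the total sum of squares carried by each component
explained = d**2 / np.sum(d**2)

print(np.allclose(F, F_proj))
print(explained[:3])
```

Note that the projection uses Qt.T here: this is the one place where transposing numpy's third return value is appropriate, because the formula calls for Q itself rather than Q^t.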

