问题描述
我正在尝试使用矩阵的奇异值分解编写用于多元回归分析(y = Xb + e
)的函数. y
和X
必须是输入系数和回归系数矢量b
,残差矢量e
和方差必须作为R2
作为输出.到目前为止,这是我目前所拥有的,而我完全被困住了.重量的labels
部分也给我一个错误.此labels
部分是什么?有人可以给我一些提示来帮助我继续吗?
I'm trying to write a functions for multiple regression analysis (y = Xb + e
) using a singular value decomposition for matrices. y
and X
must be the input and regression coefficients vector b
, the residual vector e
and variance accounted for R2
as output. Beneath is what I have so far and I'm totally stuck. The labels
part of the weight also gives me an error. What is this labels
part? Can anybody give me some tips to help me proceed?
Test <- function(X, y) {
x <- t(A) %*% A
duv <- svd(x)
x.inv <- duv$v %*% diag(1/duv$d) %*% t(duv$u)
x.pseudo.inv <- x.inv %*% t(A)
w <- x.pseudo.inv %*% labels
return(b, e, R2)
}
推荐答案
您出门在外...奇异值分解应用于模型矩阵X
而不是法线矩阵X'X
.以下是正确的过程:
You are off the road... Singular value decomposition is applied to model matrix X
rather than normal matrix X'X
. The following is the correct procedure:
因此,在编写R函数时,我们应该这样做
So when writing an R function, we should do
svdOLS <- function (X, y) {
SVD <- svd(X)
V <- SVD$v
U <- SVD$u
D <- SVD$d
## regression coefficients `b`
## use `crossprod` for `U'y`
## use recycling rule for row rescaling of `U'y` by `D` inverse
## use `as.numeric` to return vector instead of matrix
b <- as.numeric(V %*% (crossprod(U, y) / D))
## residuals
r <- as.numeric(y - X %*% b)
## R-squared
RSS <- crossprod(r)[1]
TSS <- crossprod(y - mean(y))[1]
R2 <- 1 - RSS / TSS
## multiple return via a list
list(coefficients = b, residuals = r, R2 = R2)
}
让我们测试
## toy data
set.seed(0)
x1 <- rnorm(50); x2 <- rnorm(50); x3 <- rnorm(50); y <- rnorm(50)
X <- model.matrix(~ x1 + x2 + x3)
## fitting linear regression: y ~ x1 + x2 + x3
svdfit <- svdOLS(X, y)
#$coefficients
#[1] 0.14203754 -0.05699557 -0.01256007 0.09776255
#
#$residuals
# [1] 1.327108410 -1.400843739 -0.071885339 2.285661880 0.882041795
# [6] -0.535230752 -0.927750996 0.262674650 -0.133878558 -0.559783412
#[11] 0.264204296 -0.581468657 2.436913000 1.517601798 0.774515419
#[16] 0.447774149 -0.578988327 0.664690723 -0.511052627 -1.233302697
#[21] 1.740216739 -1.065592673 -0.332307898 -0.634125164 -0.975142054
#[26] 0.344995480 -1.748393187 -0.414763742 -0.680473508 -1.547232557
#[31] -0.383685601 -0.541602452 -0.827267878 0.894525453 0.359062906
#[36] -0.078656943 0.203938750 -0.813745178 -0.171993018 1.041370294
#[41] -0.114742717 0.034045040 1.888673004 -0.797999080 0.859074345
#[46] 1.664278354 -1.189408794 0.003618466 -0.527764821 -0.517902581
#
#$R2
#[1] 0.008276773
另一方面,我们可以使用.lm.fit
来检查正确性:
On the other hand, we can use .lm.fit
to check correctness:
qrfit <- .lm.fit(X, y)
在系数和残差上完全相同:
which is exactly the same on coefficients and residuals:
all.equal(svdfit$coefficients, qrfit$coefficients)
# [1] TRUE
all.equal(svdfit$residuals, qrfit$residuals)
# [1] TRUE
这篇关于Toy R函数,用于通过奇异值分解来求解普通最小二乘的文章就介绍到这了,希望我们推荐的答案对大家有所帮助,也希望大家多多支持!