Problem description
I'm fitting a lasso logistic regression and used cv.glmnet to find the non-zero coefficients. It seems to work: some coefficients are non-zero and the rest are shrunk to zero. However, when I print the coefficients with coef, I get the full list, zeros included. Is there a way to extract only the non-zero coefficients together with their names? Here is my code:
cv.lasso = cv.glmnet(x_train,y_train, alpha = 0.6, family = "binomial")
coef(cv.lasso, s=cv.lasso$lambda.1se)
When I call coef, I get the following output:
4797 x 1 sparse Matrix of class "dgCMatrix"
1
(Intercept) -1.845702
sampleid.10 .
sampleid.1008 .
I want to extract the names and values of the non-zero coefficients. How can I do that?
Recommended answer
A very convenient way to do this is the extract.coef function from the coefplot package.
Here is a simple reproducible example, adapted from the cv.glmnet docs:
library(glmnet)
library(coefplot)
set.seed(1010)
n=1000;p=100
nzc=trunc(p/10)
x=matrix(rnorm(n*p),n,p)
beta=rnorm(nzc)
fx= x[,seq(nzc)] %*% beta
eps=rnorm(n)*5
y=drop(fx+eps)
px=exp(fx)
px=px/(1+px)
ly=rbinom(n=length(px),prob=px,size=1)
set.seed(1011)
# model:
cvob1=cv.glmnet(x,y)
Here x has 100 variables, V1 to V100. Which of them have non-zero coefficients?
extract.coef(cvob1)
# result:
Value SE Coefficient
(Intercept) -0.11291017 NA (Intercept)
V1 -0.41095526 NA V1
V2 0.50127803 NA V2
V4 -0.40319404 NA V4
V5 -0.42518885 NA V5
V6 0.42609526 NA V6
V7 0.41845873 NA V7
V8 -1.54881117 NA V8
V9 1.23284876 NA V9
V10 0.31187777 NA V10
V14 -0.03085618 NA V14
V18 -0.15211282 NA V18
V26 0.19704039 NA V26
V30 -0.11568702 NA V30
V31 -0.07108829 NA V31
V36 0.15282509 NA V36
V39 0.10250912 NA V39
V47 -0.02602025 NA V47
V60 0.04502238 NA V60
V63 -0.07051392 NA V63
V68 0.06431373 NA V68
V75 -0.35798561 NA V75
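If you would rather not add a package dependency, the same result can probably be obtained by filtering the sparse matrix that coef returns. The sketch below is not part of the original answer: it uses a small simulated binomial data set, and the names cv_fit, vals, and nonzero are purely illustrative. It assumes s = "lambda.1se", matching the question.

```r
library(glmnet)

# Simulated data for illustration only (not the asker's data)
set.seed(1010)
x <- matrix(rnorm(200 * 20), 200, 20)
colnames(x) <- paste0("V", seq_len(20))
y <- rbinom(200, 1, plogis(x[, 1] - x[, 2]))

# Cross-validated binomial fit (alpha defaults to 1, i.e. the lasso)
cv_fit <- cv.glmnet(x, y, family = "binomial")

# coef() returns a one-column sparse matrix; convert it to a named vector
cf <- coef(cv_fit, s = "lambda.1se")
vals <- as.matrix(cf)[, 1]

# Keep only the entries that are not exactly zero
nonzero <- data.frame(
  name  = names(vals)[vals != 0],
  value = unname(vals[vals != 0])
)
print(nonzero)
```

For the model in the question, replacing cv_fit with cv.lasso should give a two-column data frame with one row per non-zero coefficient, including the intercept.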