Specifying a formula in R with glm without explicitly declaring each covariate

Problem description

I would like to force specific variables into glm regressions without fully specifying each one. My real data set has about 200 variables. I haven't been able to find examples of this in my online searching so far.

For example (with just 3 variables):

n <- 200
set.seed(39)
samp <- data.frame(W1 = runif(n, min = 0, max = 1), W2 = runif(n, min = 0, max = 5))
samp <- transform(samp,  # add A
  A = rbinom(n, 1, 1 / (1 + exp(-(W1^2 - 4 * W1 + 1)))))
samp <- transform(samp,  # add Y
  Y = rbinom(n, 1, 1 / (1 + exp(-(A - sin(W1^2) + sin(W2^2) * A + 10 * log(W1) * A + 15 * log(W2) - 1 + rnorm(1, mean = 0, sd = .25))))))

If I want to include all main terms, there is an easy shortcut:

glm(Y~., family=binomial, data=samp)

But say I want to include all main terms (W1, W2, and A) plus W2^2:

glm(Y~A+W1+W2+I(W2^2), family=binomial, data=samp)

Is there a shortcut for this?

[Edit before posting:] This works!

glm(formula = Y ~ . + I(W2^2), family = binomial, data = samp)

OK, so how about this one!

I want to omit one main-term variable (W1) and include only two main terms (A and W2), plus W2^2 and the interaction W2^2:A:

glm(Y~A+W2+A*I(W2^2), family=binomial, data=samp)

Obviously, with just a few variables no shortcut is really needed, but I work with high-dimensional data. The current data set has "only" 200 variables, but some others have many thousands.

Recommended answer

Your creative use of . to build a formula containing all or almost all variables is a good, clean approach. Another option that is sometimes useful is to build the formula programmatically as a string and then convert it to a formula using as.formula:

vars <- paste("Var",1:10,sep="")
fla <- paste("y ~", paste(vars, collapse="+"))
as.formula(fla)
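
As a minimal sketch of how the string-building idea might be applied to a data frame shaped like the one in the question, the block below takes the covariate names straight from names(samp) instead of typing them out. The toy data here is an assumption for illustration only (arbitrary random values, not the question's simulation); only the column names W1, W2, A, and Y are taken from the question.

```r
# Toy data with the same column names as the question's example.
# The values are arbitrary and only serve to make the sketch runnable.
set.seed(39)
n <- 200
samp <- data.frame(W1 = runif(n), W2 = runif(n, min = 0, max = 5))
samp$A <- rbinom(n, 1, 0.5)
samp$Y <- rbinom(n, 1, 0.5)

# Every column except the response becomes a main term.
vars <- setdiff(names(samp), "Y")
fla <- as.formula(paste("Y ~", paste(vars, collapse = " + ")))

# The resulting formula object can be passed to glm like a hand-written one.
fit <- glm(fla, family = binomial, data = samp)
```

Because the variable list comes from the data frame itself, the same three lines should work unchanged whether the data has 3 covariates or 200.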

Of course, you can make the fla object far more complicated.
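
For instance, the second case from the question (drop W1, keep A and W2, add W2^2 and its interaction with A) could plausibly be handled either way. The sketch below shows both a . formula that subtracts the unwanted term and a string-built equivalent. The toy data and the names keep, extra, fit_dot, and fit_str are illustrative assumptions, not part of the original answer.

```r
# Toy data with the question's column names (arbitrary values, illustration only).
set.seed(39)
n <- 200
samp <- data.frame(W1 = runif(n), W2 = runif(n, min = 0, max = 5))
samp$A <- rbinom(n, 1, 0.5)
samp$Y <- rbinom(n, 1, 0.5)

# Option 1: start from all main terms with ".", remove W1 with "-",
# then add the squared term and its interaction with A.
fit_dot <- glm(Y ~ . - W1 + I(W2^2) + A:I(W2^2), family = binomial, data = samp)

# Option 2: build the same formula as a string.
keep  <- setdiff(names(samp), c("Y", "W1"))   # main terms to keep
extra <- c("I(W2^2)", "A:I(W2^2)")            # hand-specified extra terms
fla   <- as.formula(paste("Y ~", paste(c(keep, extra), collapse = " + ")))
fit_str <- glm(fla, family = binomial, data = samp)
```

The . form is shorter when only a handful of variables are excluded. The string form tends to be easier when the list of variables to keep or drop is itself computed, for example from a vector of column names.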


09-22 07:33