Problem description
In R, I would like to fit a gam model with categorical variables. I thought I could do it the same way as with lm (cat is the categorical variable):
lm(data = df, formula = y ~ x1*cat + x2 + x3);
But I can't do something like:
gam(data = df, formula = y ~ s(x1)*cat + s(x2) + x3)
The following does work, though:
gam(data = df, formula = y ~ cat + s(x1) + s(x2) + x3)
How do I add a categorical variable to just one of the splines?
Recommended answer
One of the comments has more or less told you how. Use a by variable:
s(x1, by = cat)
This creates a separate smooth function of x1 for each level of cat (a factor "by" smooth; note that this is not the same thing as mgcv's bs = "fs" basis). Smoothing parameters are also duplicated but not linked, so they are estimated independently. You can set
s(x1, by = cat, id = 0)
to use a single smoothing parameter for all "sub smooths".
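As a rough illustration of the difference, the sketch below fits both variants and compares how many smoothing parameters are estimated. The data frame, the seed, and the names fit_free and fit_linked are made up for this example and are not from the question.

```r
library(mgcv)

# Simulated data (hypothetical): the x1 effect differs in amplitude by group
set.seed(42)
n  <- 300
df <- data.frame(x1  = runif(n),
                 cat = factor(sample(c("a", "b", "c"), n, replace = TRUE)))
df$y <- sin(2 * pi * df$x1) * as.integer(df$cat) + rnorm(n, sd = 0.3)

# One smoothing parameter per factor level
fit_free   <- gam(y ~ cat + s(x1, by = cat), data = df)

# The same id links the per-level smooths so they share one smoothing parameter
fit_linked <- gam(y ~ cat + s(x1, by = cat, id = 0), data = df)

length(fit_free$sp)    # number of estimated smoothing parameters, unlinked
length(fit_linked$sp)  # fewer, because the sub smooths share one
```

Linking is worth considering when the groups are expected to be about equally wiggly; leaving the smoothing parameters free lets each level choose its own amount of smoothing.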
Also note that contrasts are not applied to the factor here, but each smooth is still subject to a centering constraint. This means the smooths cannot capture differences in group means, so you need to include the factor variable as a fixed effect, too:
s(x1, by = cat) + cat
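Putting this together with the model from the question, a minimal sketch might look like the following. The simulated data and the "true" effects are invented purely so the example runs; only the formula reflects the answer.

```r
library(mgcv)

# Simulated data (hypothetical): both the shape of the x1 effect
# and the mean level of y differ between the groups of cat
set.seed(1)
n  <- 300
df <- data.frame(
  x1  = runif(n),
  x2  = runif(n),
  x3  = rnorm(n),
  cat = factor(sample(c("a", "b", "c"), n, replace = TRUE))
)
df$y <- with(df,
  ifelse(cat == "a", sin(2 * pi * x1),
  ifelse(cat == "b", 2 * x1, -x1^2)) +
  as.integer(cat) + cos(2 * pi * x2) + 0.5 * x3 + rnorm(n, sd = 0.3)
)

# cat enters parametrically (group means) and as a by variable (group-specific smooths)
fit <- gam(y ~ cat + s(x1, by = cat) + s(x2) + x3, data = df)

summary(fit)          # parametric terms for cat and x3, one x1 smooth per level, plus s(x2)
plot(fit, pages = 1)  # one panel per smooth term
```

In the summary, the parametric coefficients for cat carry the differences in group level, while the centered s(x1) terms describe how the shape of the x1 effect varies by group.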