问题描述
在我的实验中,我修剪了植物并测量了它们的反应,例如在季节结束时产生的叶子质量.我操纵了剪裁强度和剪裁时间,并交叉了这两种处理方式.我还包括一个控制剪裁处理,导致 5 种不同的剪裁处理组合.每次处理 12 株植物,我在两年内跟踪了总共 60 株植物.也就是说,我在第 1 年收集了这 60 株植物的测量值,并在第 2 年再次收集了相同的植物.
For my experiment, I clipped plants and measured their responses, such as leaf mass produced, at the end of the season. I manipulated both clipping intensity and clipping time and crossed these two treatments. I also included a control clipped treatment resulting in 5 different clipping treatment combinations. With 12 plants per treatment there is a total of 60 plants which I followed over the course of two years. That is, I collected measurements on these 60 plants in year 1 and the same plants again in year 2.
最简单的方法是分别分析 5 种不同的处理方法.然而,我想获得时间和强度的影响及其相互作用,但由于控制处理没有与时间或强度完全交叉,这使得我的实验设计不平衡且统计上棘手.更复杂的是,我想将年份的影响也包括到我的模型中.
It would be simplest to just analyze the 5 different treatments separately. However, I would like to obtain the effects of timing and intensity and their interactions, but because of the control treatment which is not fully crossed with either timing or intensity, this makes my experimental design unbalanced and statistically tricky. To complicate this a bit more, I would like to include the effect of year into my model as well.
理想情况下,我可以使用 lme4 来做到这一点,这使得使用 lsmeans 包之后的多重比较变得轻而易举.
Ideally, I would be able to do this using lme4 which makes multiple comparison a breeze afterwards with the lsmeans package.
当我尝试运行我的模型时
When I try to run my model
m1<-lmer(log(plant.leaf.g+1)~timing*intensity*year+(1|id), data=cmv) #not significant
我遇到了警告固定效应模型矩阵是秩不足的,因此删除了 8 列/系数".
I am met with the warning "fixed-effect model matrix is rank deficient so dropping 8 columns / coefficients".
有谁知道我可以让这个不平衡的混合模型与 lme4 一起工作的方法吗?
Does anyone know of a way I can get this unbalanced mixed model to work with lme4?
这是我的数据的一个子集,其中从不"在计时和零"在强度下任意替换了控制"处理:
Here is a subset of my data where "never" under timing and "zero" under intensity arbitrarily replaced "control" treatment:
id year timing intensity treatment plant.leaf.g
91 2015 early low early-low 315.944
92 2015 never zero control 99.28
93 2015 late high late-high 663.936
94 2015 early low early-low 25.488
95 2015 early high early-high 453.57
96 2015 late low late-low 90.804
97 2015 never zero control 1312.098
98 2015 late high late-high 959.82
99 2015 late low late-low 28.014
100 2015 late high late-high 178.56
91 2014 early low early-low 289.14
92 2014 never zero control 61.774
93 2014 late high late-high 639.936
94 2014 early low early-low 138.39
95 2014 early high early-high 168.216
96 2014 late low late-low 51.008
97 2014 never zero control 966.112
98 2014 late high late-high 279.048
99 2014 late low late-low 23.936
100 2014 late high late-high 169.344
cmv<-structure(list(id = c(91L, 92L, 93L, 94L, 95L, 96L, 97L, 98L,
99L, 100L, 101L, 102L, 103L, 105L, 106L, 107L, 108L, 109L, 110L,
91L, 92L, 93L, 94L, 95L, 96L, 97L, 98L, 99L, 100L, 101L, 102L,
103L, 104L, 105L, 106L, 107L, 108L, 109L, 110L), year = c(2015L,
2015L, 2015L, 2015L, 2015L, 2015L, 2015L, 2015L, 2015L, 2015L,
2015L, 2015L, 2015L, 2015L, 2015L, 2015L, 2015L, 2015L, 2015L,
2014L, 2014L, 2014L, 2014L, 2014L, 2014L, 2014L, 2014L, 2014L,
2014L, 2014L, 2014L, 2014L, 2014L, 2014L, 2014L, 2014L, 2014L,
2014L, 2014L), timing = structure(c(1L, 3L, 2L, 1L, 1L, 2L, 3L,
2L, 2L, 2L, 2L, 1L, 1L, 2L, 3L, 1L, 1L, 3L, 2L, 1L, 3L, 2L, 1L,
1L, 2L, 3L, 2L, 2L, 2L, 2L, 1L, 1L, 2L, 2L, 3L, 1L, 1L, 3L, 2L
), .Label = c("early", "late", "never"), class = "factor"), intensity = structure(c(2L,
3L, 1L, 2L, 1L, 2L, 3L, 1L, 2L, 1L, 2L, 1L, 1L, 2L, 3L, 2L, 1L,
3L, 1L, 2L, 3L, 1L, 2L, 1L, 2L, 3L, 1L, 2L, 1L, 2L, 1L, 1L, 2L,
2L, 3L, 2L, 1L, 3L, 1L), .Label = c("high", "low", "zero"), class = "factor"),
treatment = structure(c(3L, 1L, 4L, 3L, 2L, 5L, 1L, 4L, 5L,
4L, 5L, 2L, 2L, 5L, 1L, 3L, 2L, 1L, 4L, 3L, 1L, 4L, 3L, 2L,
5L, 1L, 4L, 5L, 4L, 5L, 2L, 2L, 5L, 5L, 1L, 3L, 2L, 1L, 4L
), .Label = c("control", "early-high", "early-low", "late-high",
"late-low"), class = "factor"), plant.stem.g = c(315.944,
99.28, 663.936, 25.488, 453.57, 90.804, 1312.098, 959.82,
28.014, 178.56, 158.12, 387.528, 288.75, 327.348, 770.44,
835.05, 457.188, 942.002, 229.194, 289.14, 61.774, 639.936,
138.39, 168.216, 51.008, 966.112, 279.048, 23.936, 169.344,
154.14, 703.04, 836.4, 511.92, 463.524, 245.226, 267.41,
439.392, 714.85, 68.012)), .Names = c("id", "year", "timing",
"intensity", "treatment", "plant.stem.g"), class = "data.frame", row.names = c(NA,
-39L))
注意:我已经让 m1=aov(plant.leaf.g~intensity*timing*year+Error(id), data=cmv)
运行,但我读到我应该使用car
包中的 Anova type="3" 函数来获取我的 p 值,但我无法使用 Error(id) 项来做到这一点.我也无法与 TukeyHSD
函数或 multcomp
包进行多重比较.
Note: I have gotten m1=aov(plant.leaf.g~intensity*timing*year+Error(id), data=cmv)
to run, but I read that I should use Anova type="3" function from the car
package to obtain my p-values, but I haven't been able to do this with the Error(id) term. Nor have I been able do a multiple comparison with TukeyHSD
function or multcomp
package.
推荐答案
m1<-lmer(log(plant.leaf.g+1)~timing*intensity*year+(1|id),
data=cmv)
(除了其中包含零的对数转换数据很棘手;您确定加 1 是正确的吗?只有在叶质量无单位时才有意义.您可以考虑添加 min(plant.leaf.g[plant.leaf.g>0])/2
代替...)
(except that log-transforming data with zeros in it is tricky; are you sure that adding 1 is correct? It only makes perfect sense if leaf mass is unitless. You might consider adding min(plant.leaf.g[plant.leaf.g>0])/2
instead ...)
出现警告(不是错误)是因为您的数据集中没有时间、强度和年份的所有组合,但您要求 R 估计每个组合的参数.一些合理的选择是:
The warning (not an error) occurs because you don't have all combinations of timing, intensity, and year in your data set, but you are asking R to estimate parameters for every combination. A few reasonable choices are:
- 忽略警告(在比较每个因素的整体影响时,您可能无论如何都会得到合理的答案)
- 降低模型的复杂性,特别是通过消除 3 向交互(即使用
(timing+intensity+year)^2
)(我假设这会起作用,但是您如果您的数据中缺少时间和强度的组合,则可能需要进一步简化模型) - 从三向交互构建单向方差分析,例如
cmv$int <- with(cmv,interaction(timing,intensity,year,drop=TRUE))
(但是你将无法分离主效应和交互)
- ignore the warning (you'll probably get reasonable answers anyway when comparing the overall effect of each factor)
- reduce the complexity of the model, in particular by eliminating the 3-way interaction (i.e. use
(timing+intensity+year)^2
) (I'm assuming this will work, but you might need to simplify the model still further if e.g. there are combinations of timing and intensity that are missing from your data) - construct a one-way ANOVA from the 3-way interaction, e.g.
cmv$int <- with(cmv,interaction(timing,intensity,year,drop=TRUE))
(but then you won't be able to separate main effects and interactions)
这篇关于R-用lme4分析重复测量不平衡设计?的文章就介绍到这了,希望我们推荐的答案对大家有所帮助,也希望大家多多支持!