I am trying to run a two-way ANOVA on multiple subsets of a data frame without having to subset the data by hand, as this is inefficient. Sample data:
DF<-structure(list(Sample = c(666L, 676L, 686L, 667L, 677L, 687L,
822L, 832L, 842L, 824L, 834L, 844L), Time = c(300L, 300L, 300L,
300L, 300L, 300L, 400L, 400L, 400L, 400L, 400L, 400L), Ploidy = structure(c(1L,
1L, 1L, 1L, 1L, 1L, 2L, 2L, 2L, 2L, 2L, 2L), .Label = c("2n",
"3n"), class = "factor"), Tissue = c("muscle", "muscle", "muscle",
"liver", "liver", "liver", "intestine", "intestine", "intestine",
"gill", "gill", "gill"), X.lipid = c(1.1, 0.8, 1.3, 3.7, 3.9,
3.8, 5.2, 3.4, 6, 7.6, 10.4, 6.7), l.dec = c(0.011, 0.008, 0.013,
0.037, 0.039, 0.038, 0.052, 0.034, 0.06, 0.076, 0.104, 0.067),
l.arc = c(0.105074124512229, 0.0895624074394449, 0.114266036973812,
0.193560218793138, 0.19879088899975, 0.196192082631721, 0.230059118691331,
0.185452088760136, 0.247467063170448, 0.279298057669285,
0.328359182374352, 0.261824790465914)), .Names = c("Sample",
"Time", "Ploidy", "Tissue", "X.lipid", "l.dec", "l.arc"), row.names = c(1L,
2L, 3L, 4L, 5L, 6L, 69L, 70L, 71L, 72L, 73L, 74L), class = "data.frame")
I have come across similar examples: "Anova, for loop to apply function" and "ANOVA on multiple responses, by multiple groups NOT part of formula".
I can get close, but I do not believe this is correct, as it uses aov as opposed to anova:
x <- unique(DF$Tissue)
sapply(x, function(my) {
  f <- as.formula(paste("l.dec~Time*Ploidy"))
  aov(f, data=DF)
}, simplify=FALSE)
If I switch aov for anova, it returns an error message:
Error in UseMethod("anova") :
no applicable method for 'anova' applied to an object of class "formula"
The long way around, which I know to be correct, is as follows:
# Subset by each Tissue type (just one here, as an example)
muscle <- subset(DF, Tissue == "muscle")
# Perform ANOVA
anova(lm(l.dec ~ Ploidy * Time, data = muscle))
However, in the main data frame I have many tissue types and want to avoid performing this subset for each one.
I believe the sapply approach above is close, but I need help with the final stages.
Building on @user20650 and my comments above, I would suggest first using sapply with lm to generate your list of models, and then using sapply again on that list to generate your ANOVA tables. That way the list of models will be available to you, so you can get coefficients, fitted values, residuals, etc.
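x <- unique(DF$Tissue)
# Fit one model per tissue; lm's subset argument picks the rows for that tissue
models <- sapply(x, function(my) {
  lm(l.dec ~ Time * Ploidy, data = DF, subset = Tissue == my)
}, simplify = FALSE)
# Build one ANOVA table per fitted model
ANOVA.tables <- sapply(models, anova, simplify = FALSE)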
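As a rough illustration of how the two lists can be used afterwards, here is a minimal sketch on made-up data. The data frame toy, its values, and the names anova_tables and p_interaction are hypothetical and not from the question. Toy data is used because, in the 12-row excerpt posted above, each tissue has only one Time and one Ploidy, so the interaction model cannot be illustrated on it directly.

```r
# Hypothetical toy data (NOT the asker's data): every tissue contains both
# ploidy levels and both time points, with three replicates per cell.
set.seed(1)
toy <- expand.grid(
  Rep    = 1:3,
  Time   = c(300L, 400L),
  Ploidy = factor(c("2n", "3n")),
  Tissue = c("muscle", "liver"),
  stringsAsFactors = FALSE
)
toy$l.dec <- round(runif(nrow(toy), 0.01, 0.10), 3)

# Same pattern as above: one lm per tissue, then one ANOVA table per model.
x <- unique(toy$Tissue)
models <- sapply(x, function(my) {
  lm(l.dec ~ Time * Ploidy, data = toy, subset = Tissue == my)
}, simplify = FALSE)
anova_tables <- sapply(models, anova, simplify = FALSE)

# Both lists are named by tissue, so single results can be pulled out by name.
anova_tables[["muscle"]]
coef(models[["liver"]])

# Collect the interaction p-value from every table into one named vector.
p_interaction <- sapply(anova_tables, function(tab) tab["Time:Ploidy", "Pr(>F)"])
p_interaction
```

Because x is a character vector, sapply names each list element after its tissue, which is what makes the lookups by name possible.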