我们正在为生物学学生教授统计类(class),并尝试使用 R 作为计算和数据可视化平台。尽可能地,我们希望避免在 R 中使用额外的包和做任何非常“花哨”的事情;本类(class)的重点是统计,而不是编程。尽管如此,我们还没有找到一种很好的方法来在 R 中为两因素方差分析设计生成误差条图。我们正在使用 ggplot2 包来制作绘图,虽然它有一个内置的 stat_summary 方法来生成 95% CI 误差条,但这些计算方法可能并不总是正确的方法。下面,我手动查看方差分析的代码并手动计算 95% 置信区间(根据总残差方差估计标准误差,而不仅仅是 ggplot 的汇总方法将使用的组内方差)。最后,其实是有剧情的。
所以问题是......是否有更简单/更快/更简单的方法来完成所有这些?
# LIZARD LENGTH DATA
island.1 <- c(0.2, 5.9, 6.1, 6.5)
island.2 <- c(5.6, 14.8, 15.5, 16.4)
island.3 <- c(0.8, 3.9, 4.3, 4.9)
sex.codes <- c("Male", "Female", "Male", "Female")
# PUTTING DATA TOGETHER IN A DATA FRAME
df.1 <- data.frame(island.1, island.2, island.3, sex.codes)
# MELTING THE DATA FRAME INTO LONG FORM
library(reshape)
df.2 <- melt(df.1)
# MEAN BY CELL
mean.island1.male <- with(df.2, mean(value[variable == "island.1" & sex.codes == "Male"]))
mean.island1.female <- with(df.2, mean(value[variable == "island.1" & sex.codes == "Female"]))
mean.island2.male <- with(df.2, mean(value[variable == "island.2" & sex.codes == "Male"]))
mean.island2.female <- with(df.2, mean(value[variable == "island.2" & sex.codes == "Female"]))
mean.island3.male <- with(df.2, mean(value[variable == "island.3" & sex.codes == "Male"]))
mean.island3.female <- with(df.2, mean(value[variable == "island.3" & sex.codes == "Female"]))
# ADDING CELL MEANS TO DATA FRAME
df.2$means[df.2$variable == "island.1" & df.2$sex.codes == "Male"] <- mean.island1.male
df.2$means[df.2$variable == "island.1" & df.2$sex.codes == "Female"] <- mean.island1.female
df.2$means[df.2$variable == "island.2" & df.2$sex.codes == "Male"] <- mean.island2.male
df.2$means[df.2$variable == "island.2" & df.2$sex.codes == "Female"] <- mean.island2.female
df.2$means[df.2$variable == "island.3" & df.2$sex.codes == "Male"] <- mean.island3.male
df.2$means[df.2$variable == "island.3" & df.2$sex.codes == "Female"] <- mean.island3.female
# LINEAR MODEL
lizard.model <- lm(value ~ variable*sex.codes, data=df.2)
# CALCULATING RESIDUALS BY HAND:
df.2$residuals.1 <- df.2$value - df.2$means
# CONFIRMING RESIDUALS FROM LINEAR MODEL:
df.2$residuals.2 <- residuals(lizard.model)
# TWO FACTOR MAIN EFFECT ANOVA
lizard.anova <- anova(lizard.model)
# INTERACTION PLOT
interaction.plot(df.2$variable, df.2$sex.codes, df.2$value)
# SAMPLE SIZE IN EACH CELL
n <- length(df.2$value[df.2$variable == "island.1" & df.2$sex.codes == "Male"])
# > n
# [1] 2
# NOTE: JUST FOR CLARITY, PRETEND n=10
n <- 10
# CALCULATING STANDARD ERROR
island.se <- sqrt(lizard.anova$M[4]/n)
# HALF CONFIDENCE INTERVAL
island.ci.half <- qt(0.95, lizard.anova$D[4]) * island.se
# MAKING SUMMARY DATA FRAME
summary.df <- data.frame(
Means = c(mean.island1.male,
mean.island1.female,
mean.island2.male,
mean.island2.female,
mean.island3.male,
mean.island3.female),
Location = c("island1",
"island1",
"island2",
"island2",
"island3",
"island3"),
Sex = c("male",
"female",
"male",
"female",
"male",
"female"),
CI.half = rep(island.ci.half, 6)
)
# > summary.df
# Means Location Sex CI.half
# 1 3.15 island1 male 2.165215
# 2 6.20 island1 female 2.165215
# 3 10.55 island2 male 2.165215
# 4 15.60 island2 female 2.165215
# 5 2.55 island3 male 2.165215
# 6 4.40 island3 female 2.165215
# GENERATING THE ERRORBAR PLOT
library(ggplot2)
qplot(data=summary.df,
y=Means,
x=Location,
group=Sex,
ymin=Means-CI.half,
ymax=Means+CI.half,
geom=c("point", "errorbar", "line"),
color=Sex,
shape=Sex,
width=0.25) + theme_bw()
最佳答案
这是使用 sciplot 包的另一种尝试。可以在参数 ci.fun 中传递计算置信区间的替代方法。
lineplot.CI(variable,value, group =sex.codes , data = df.2, cex = 1.5,
xlab = "Location", ylab = "means", cex.lab = 1.2, x.leg = 1,
col = c("blue","red"), pch = c(16,16))
关于r - R中的两因素方差分析误差条图,我们在Stack Overflow上找到一个类似的问题:https://stackoverflow.com/questions/7977742/