Problem description
I have a model in R that includes a significant three-way interaction between two continuous independent variables (IVContinuousA and IVContinuousB) and one categorical variable, IVCategorical (with two levels: Control and Treatment). The dependent variable (DV) is continuous.
model <- lm(DV ~ IVContinuousA * IVContinuousB * IVCategorical)
You can find the data here
I am trying to find a way to visualise this in R to ease my interpretation of it (perhaps in ggplot2?).
Somewhat inspired by this blog post, I thought that I could dichotomise IVContinuousB into high and low values (so it would itself be a two-level factor):
IVContinuousBHigh <- mean(IVContinuousB) + sd(IVContinuousB)
IVContinuousBLow <- mean(IVContinuousB) - sd(IVContinuousB)
I then planned to plot the relationship between DV and IVContinuousA, with fitted lines representing the slope of this relationship for each combination of IVCategorical and my new dichotomised IVContinuousB:
IVCategoricalControl and IVContinuousBHigh
IVCategoricalControl and IVContinuousBLow
IVCategoricalTreatment and IVContinuousBHigh
IVCategoricalTreatment and IVContinuousBLow
My first question: does this sound like a viable way to produce an interpretable plot of this three-way interaction? I want to avoid 3D plots if possible, as I don't find them intuitive... Or is there another way to go about it? Maybe facet plots for the different combinations above?
If it is an OK solution, my second question is: how do I generate the data to predict the fitted lines that represent the different combinations above?
Third question - does anyone have any advice as to how to code this up in ggplot2?
I posted a very similar question on Cross Validated but because it is more code related I thought I would try here instead (I will remove the CV post if this one is more relevant to the community :) )
Thanks so much in advance,
Sarah
Note that there are NAs (left as blanks) in the DV column and the design is unbalanced, with slightly different numbers of data points in the Control and Treatment groups of the variable IVCategorical.
FYI, I have the code for visualising a two-way interaction between IVContinuousA and IVCategorical:
A <- ggplot(data=data, aes(x=AOTAverage, y=SciconC, group=MisinfoCondition, shape=MisinfoCondition, col=MisinfoCondition)) +
  geom_point(size=2) +
  geom_smooth(method='lm', formula=y~x)
But what I want is to plot this relationship conditional on IVContinuousB....
Here are a couple of options for visualizing the model output in two dimensions. I'm assuming that the goal here is to compare Treatment to Control.
library(tidyverse)
library(readxl)  # read_excel() is not attached by library(tidyverse)

theme_set(theme_classic() +
            theme(panel.background=element_rect(colour="grey40", fill=NA)))

dat = read_excel("Some Data.xlsx") # I downloaded your data file

mod <- lm(DV ~ IVContinuousA * IVContinuousB * IVCategorical, data=dat)

# Function to create prediction grid data frame
make_pred_dat = function(data=dat, nA=20, nB=5) {
  nCat = length(unique(data$IVCategorical))
  d = with(data,
           data.frame(IVContinuousA=rep(seq(min(IVContinuousA), max(IVContinuousA), length=nA), nB*nCat),
                      IVContinuousB=rep(rep(seq(min(IVContinuousB), max(IVContinuousB), length=nB), each=nA), nCat),
                      IVCategorical=rep(unique(IVCategorical), each=nA*nB)))
  # Predictions come from the global model object `mod`
  d$DV = predict(mod, newdata=d)
  return(d)
}
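The same kind of grid can also be built with `expand.grid()`, which avoids the nested `rep()` bookkeeping. The following is only a minimal sketch under stated assumptions: it uses simulated stand-in data rather than the asker's file, and `sim`, `sim_mod`, and `make_grid` are hypothetical names introduced for illustration.

```r
# Simulated stand-in data (assumption); the real data come from the Excel file
set.seed(1)
n <- 200
sim <- data.frame(
  IVContinuousA = rnorm(n),
  IVContinuousB = rnorm(n),
  IVCategorical = sample(c("Control", "Treatment"), n, replace = TRUE)
)
sim$DV <- with(sim, 0.5 * IVContinuousA + 0.3 * IVContinuousB +
  0.4 * IVContinuousA * IVContinuousB * (IVCategorical == "Treatment")) +
  rnorm(n)

sim_mod <- lm(DV ~ IVContinuousA * IVContinuousB * IVCategorical, data = sim)

# Hypothetical helper: every combination of nA values of A, nB values of B,
# and each group, with the model prediction attached
make_grid <- function(data, model, nA = 20, nB = 5) {
  grid <- expand.grid(
    IVContinuousA = seq(min(data$IVContinuousA), max(data$IVContinuousA),
                        length.out = nA),
    IVContinuousB = seq(min(data$IVContinuousB), max(data$IVContinuousB),
                        length.out = nB),
    IVCategorical = unique(data$IVCategorical),
    stringsAsFactors = FALSE
  )
  grid$DV <- predict(model, newdata = grid)
  grid
}

grid <- make_grid(sim, sim_mod)
head(grid)
```

Passing the model as an argument, rather than reading it from the global environment, makes it easier to reuse the helper with a different fit.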
IVContinuousA vs. DV by levels of IVContinuousB
The roles of IVContinuousA and IVContinuousB can of course be switched here.
ggplot(make_pred_dat(), aes(x=IVContinuousA, y=DV, colour=IVCategorical)) +
  geom_line() +
  facet_grid(. ~ round(IVContinuousB,2)) +
  ggtitle("IVContinuousA vs. DV, by Level of IVContinuousB") +
  labs(colour="")
You can make a similar plot without faceting, but it gets difficult to interpret as the number of IVContinuousB levels increases:
ggplot(make_pred_dat(nB=3),
       aes(x=IVContinuousA, y=DV, colour=IVCategorical, linetype=factor(round(IVContinuousB,2)))) +
  geom_line() +
  #facet_grid(. ~ round(IVContinuousB,2)) +
  ggtitle("IVContinuousA vs. DV, by Level of IVContinuousB") +
  labs(colour="", linetype="IVContinuousB") +
  scale_linetype_manual(values=c("1434","11","62")) +
  guides(linetype=guide_legend(reverse=TRUE))
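The mean ± 1 SD idea from the question fits the same pattern: instead of equally spaced IVContinuousB values, predict at one SD below and one SD above the mean, giving four fitted lines. This is a rough sketch, assuming simulated stand-in data; `sim`, `sim_mod`, `b_levels`, `pred`, and `BLevel` are hypothetical names, and the plot is drawn only if ggplot2 is installed.

```r
# Simulated stand-in data (assumption); replace with the real data frame
set.seed(2)
n <- 200
sim <- data.frame(
  IVContinuousA = rnorm(n),
  IVContinuousB = rnorm(n),
  IVCategorical = sample(c("Control", "Treatment"), n, replace = TRUE)
)
sim$DV <- with(sim, IVContinuousA * IVContinuousB *
  ifelse(IVCategorical == "Treatment", 0.5, -0.2)) + rnorm(n)
sim_mod <- lm(DV ~ IVContinuousA * IVContinuousB * IVCategorical, data = sim)

# "Low" and "high" IVContinuousB: one SD below and above the mean
b_mean <- mean(sim$IVContinuousB, na.rm = TRUE)
b_sd <- sd(sim$IVContinuousB, na.rm = TRUE)
b_levels <- c(Low = b_mean - b_sd, High = b_mean + b_sd)

# Prediction grid: a range of A crossed with the two B values and both groups
pred <- expand.grid(
  IVContinuousA = seq(min(sim$IVContinuousA), max(sim$IVContinuousA),
                      length.out = 50),
  IVContinuousB = unname(b_levels),
  IVCategorical = c("Control", "Treatment"),
  stringsAsFactors = FALSE
)
pred$DV <- predict(sim_mod, newdata = pred)
pred$BLevel <- factor(ifelse(pred$IVContinuousB == b_levels[["Low"]],
                             "Low (-1 SD)", "High (+1 SD)"))

# Four fitted lines: colour for group, linetype for low/high IVContinuousB
if (requireNamespace("ggplot2", quietly = TRUE)) {
  library(ggplot2)
  p <- ggplot(pred, aes(x = IVContinuousA, y = DV,
                        colour = IVCategorical, linetype = BLevel)) +
    geom_line() +
    labs(colour = "", linetype = "IVContinuousB")
  print(p)
}
```

Note that IVContinuousB stays continuous in the model; only the values at which predictions are drawn are fixed, so no information is lost by binning the data.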
Heat map of the model-predicted difference, DV treatment - DV control, on a grid of IVContinuousA and IVContinuousB values
Below, we look at the difference between treatment and control at each pair of IVContinuousA and IVContinuousB values.
# arrange() puts Control before Treatment within each (A, B) pair,
# so diff(DV) is Treatment - Control
ggplot(make_pred_dat(nA=100, nB=100) %>%
         group_by(IVContinuousA, IVContinuousB) %>%
         arrange(IVCategorical) %>%
         summarise(DV = diff(DV)),
       aes(x=IVContinuousA, y=IVContinuousB)) +
  geom_tile(aes(fill=DV)) +
  scale_fill_gradient2(low="red", mid="white", high="blue") +
  labs(fill=expression(Delta*DV~(Treatment - Control)))
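For this model, the Treatment - Control difference at a given (IVContinuousA, IVContinuousB) pair is a function of only the four coefficients that involve IVCategorical. The sketch below checks that on simulated stand-in data (an assumption; `sim`, `sim_mod`, and `grid` are hypothetical names), computing the difference once from two sets of predictions and once directly from the coefficients. The coefficient names assume default treatment contrasts with Control as the reference level.

```r
# Simulated stand-in data (assumption); replace with the real data frame
set.seed(3)
n <- 200
sim <- data.frame(
  IVContinuousA = rnorm(n),
  IVContinuousB = rnorm(n),
  IVCategorical = sample(c("Control", "Treatment"), n, replace = TRUE)
)
sim$DV <- with(sim, 0.3 * IVContinuousA +
  0.5 * IVContinuousA * IVContinuousB * (IVCategorical == "Treatment")) +
  rnorm(n)
sim_mod <- lm(DV ~ IVContinuousA * IVContinuousB * IVCategorical, data = sim)

grid <- expand.grid(
  IVContinuousA = seq(-2, 2, length.out = 5),
  IVContinuousB = seq(-2, 2, length.out = 5)
)

# Difference from two sets of predictions on the same (A, B) grid
pred_ctrl <- predict(sim_mod, newdata = transform(grid, IVCategorical = "Control"))
pred_treat <- predict(sim_mod, newdata = transform(grid, IVCategorical = "Treatment"))
grid$diff_pred <- unname(pred_treat - pred_ctrl)

# Same difference from the coefficients that involve IVCategorical
b <- coef(sim_mod)
grid$diff_coef <- with(grid,
  b[["IVCategoricalTreatment"]] +
  b[["IVContinuousA:IVCategoricalTreatment"]] * IVContinuousA +
  b[["IVContinuousB:IVCategoricalTreatment"]] * IVContinuousB +
  b[["IVContinuousA:IVContinuousB:IVCategoricalTreatment"]] *
    IVContinuousA * IVContinuousB)

all.equal(grid$diff_pred, grid$diff_coef)
```

Predicting on two separate grids also avoids relying on row order when subtracting, which can be a little more robust than sorting and calling `diff()`.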