Problem description
This post is related to an earlier post.
Here I have xy grouped data where the y values are fractions:
library(dplyr)
library(ggplot2)
library(ggpmisc)
set.seed(1)
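# age must be a factor: as.integer() on the strings "d2", "d4", "d45" would give NA,
# whereas on a factor it returns the level codes 1, 2, 3
df1 <- data.frame(value = c(0.8,0.5,0.4,0.2,0.5,0.6,0.5,0.48,0.52),
                  age = factor(rep(c("d2","d4","d45"),3)),
                  group = c("A","A","A","B","B","B","C","C","C")) %>%
  dplyr::mutate(time = as.integer(age)) %>%
  dplyr::arrange(group,time) %>%
  dplyr::mutate(group_age=paste0(group,"_",age))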
df1$group_age <- factor(df1$group_age,levels=unique(df1$group_age))
What I'm trying to achieve is to plot df1 as a bar plot, like this:
ggplot(df1,aes(x=group_age,y=value,fill=age)) +
geom_bar(stat='identity')
But I want to fit a binomial glm with a logit link function to each group, to estimate how these fractions are affected by time.
Let's say I have 100 observations for each age (time) in each group:
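df2 <- do.call(rbind,lapply(1:nrow(df1),function(i){
  # round() guards against floating-point truncation in rep(), e.g. 100*(1-0.8) is slightly below 20
  n_true <- round(100*df1$value[i])
  data.frame(age=df1$age[i],group=df1$group[i],time=df1$time[i],group_age=df1$group_age[i],value=c(rep(TRUE,n_true),rep(FALSE,100-n_true)))
}))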
Then the glm for each group (e.g., group A) is:
glm(value ~ time, dplyr::filter(df2, group == "A"), family = binomial(link='logit'))
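As a side note, here is a minimal sketch of an equivalent fit that skips the row-per-observation expansion. It assumes each fraction really comes from exactly 100 trials. The trial counts are passed to glm through `weights`, and the result is compared with the fit on expanded 0/1 data. The data frame `grpA` and the column `n` are hypothetical stand-ins for group A of df1, with time coded 1, 2, 3.

```r
# Group A fractions; time coded 1, 2, 3 and n = 100 trials per fraction (both assumptions)
grpA <- data.frame(value = c(0.8, 0.5, 0.4), time = 1:3, n = 100)

# Binomial glm on proportions: 'weights' is the number of trials behind each fraction
fit_agg <- glm(value ~ time, data = grpA, weights = n,
               family = binomial(link = "logit"))

# The same model fitted on expanded 0/1 data, for comparison
expanded <- do.call(rbind, lapply(seq_len(nrow(grpA)), function(i) {
  k <- round(grpA$n[i] * grpA$value[i])  # number of successes
  data.frame(time = grpA$time[i],
             value = c(rep(1, k), rep(0, grpA$n[i] - k)))
}))
fit_exp <- glm(value ~ time, data = expanded,
               family = binomial(link = "logit"))

# Slope estimate and p-value for time
coef(summary(fit_agg))["time", c("Estimate", "Pr(>|z|)")]
```

Both fits should give the same coefficients, because the binomial likelihood of the aggregated counts differs from the Bernoulli likelihood of the expanded rows only by a constant.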
So I would like to add to the plot above the estimated regression slopes for each group along with their corresponding p-values (similar to what I'm doing for the continuous df$value in this post).
I thought that using:
ggplot(df1,aes(x=group_age,y=value,fill=age)) +
geom_bar(stat='identity') +
geom_smooth(data=df2,mapping=aes(x=group_age,y=value,group=group),color="black",method='glm',method.args=list(family=binomial(link='logit')),size=1,se=T) +
stat_poly_eq(aes(label=stat(p.value.label)),formula=my_formula,parse=T,npcx="center",npcy="bottom") +
scale_x_log10(name="Age",labels=levels(df$age),breaks=1:length(levels(df$age))) +
facet_wrap(~group) + theme_minimal()
would work, but I get the error:
Error in Math.factor(x, base) : ‘log’ not meaningful for factors
Any idea how to get this right?
Recommended answer
I believe this will help:
library(tidyverse)
library(broom)
df2$value <- as.numeric(df2$value)
# Fit one binomial glm per group and extract the coefficients
dfCoef <- df2 %>% group_by(group) %>%
  group_modify(~ tidy(glm(value ~ time, data = .x, family = binomial(link='logit'))))
# Create labels from the slope (the 'time' term) and its p-value
dfCoef %>% filter(term=='time') %>% mutate(Label=paste0(round(estimate,3),' (p=',round(p.value,3),')'),
                                           group_age=paste0(group,'_','d4')) %>%
  select(c(group,Label,group_age)) -> Labels
# Label height: the tallest stacked bar in each group
df2 %>% group_by(group,group_age) %>% summarise(value=sum(value)) %>% ungroup() %>%
  group_by(group) %>% filter(value==max(value)) %>% select(-group_age) -> values
# Combine
Labels %>% left_join(values, by='group') -> Labels
Labels %>% mutate(age=NA) -> Labels
# Plot: draw the bars first so the labels are not hidden behind them
ggplot(df2,aes(x=group_age,y=value,fill=age)) +
  geom_bar(stat='identity')+
  geom_text(data=Labels,aes(x=group_age,y=value,label=Label),fontface='bold',vjust=-0.5)+
  facet_wrap(.~group,scales='free')
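The error in the question comes from applying scale_x_log10 to a factor x-axis (group_age). As a hedged alternative, the sketch below puts the numeric time on the x-axis so geom_smooth can draw the binomial glm curve directly on the fractions. It rebuilds df1 so the block is self-contained. The column `n` (100 observations per fraction) and the 1, 2, 3 coding of time are assumptions, and this is not the answer's method.

```r
library(ggplot2)

# Rebuild of the question's data; time is the factor level code 1, 2, 3 (assumption)
df1 <- data.frame(
  value = c(0.8, 0.5, 0.4, 0.2, 0.5, 0.6, 0.5, 0.48, 0.52),
  age   = factor(rep(c("d2", "d4", "d45"), 3)),
  group = rep(c("A", "B", "C"), each = 3)
)
df1$time <- as.integer(df1$age)
df1$n <- 100  # assumed number of observations behind each fraction

p <- ggplot(df1, aes(x = time, y = value)) +
  geom_col(aes(fill = age)) +
  # the weight aesthetic is passed to glm() as 'weights', i.e. the trial counts
  geom_smooth(aes(weight = n), method = "glm", formula = y ~ x,
              method.args = list(family = binomial(link = "logit")),
              color = "black", se = TRUE) +
  # continuous x-axis relabelled with the age levels instead of scale_x_log10
  scale_x_continuous(name = "Age", breaks = 1:3, labels = levels(df1$age)) +
  facet_wrap(~group) +
  theme_minimal()

p
```

Because facet_wrap splits the data by group, each panel gets its own fitted curve. The slope and p-value labels could then be added with geom_text as in the answer above.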