
Problem description

I have a dataframe (survey data) over time with a binary outcome of interest (0 or 1) for two groups (C for control and T for treatment), like the following:

set.seed(3546)
Data <- data.frame(
    date = sample((as.Date(as.Date("2011-12-30"):as.Date("2012-01-04"),
                           origin="1970-01-01")),
                   1000, replace = TRUE),
    treatment_group = sample(c("C", "T"), 1000, replace = TRUE),
    outcome = sample(c("1", "0"), 1000, replace = TRUE)
    )

For this, I plot the proportion of respondents with outcome 1 separately for each of the two groups, using the following code:

library(dplyr)    # mutate(), group_by(), summarise(), %>%
library(ggplot2)

Data %>%
    mutate(treatment_group = factor(treatment_group, levels = c("T", "C")),
           date = as.POSIXct(date)) %>%
    group_by(treatment_group, date) %>%
    summarise(prop = sum(outcome == "1") / n()) %>%   # proportion with outcome 1 per group and date
    ggplot() +
    theme_classic() +
    xlab("Date") +
    ylab("Proportion outcome mentioned") +
    scale_color_manual(values = c("C" = "black", "T" = "darkgrey"),
                       labels = c("C" = "Remaining sample",
                                  "T" = "Treated Group"),
                       name = "Legend") +
    geom_smooth(aes(x = date, y = prop, color = treatment_group),
                se = FALSE, method = "loess") +
    geom_point(aes(x = date, y = prop, color = treatment_group))

This gives me the following plot:

What I would like, but can't figure out how to produce, is a single line showing the difference in proportions between the two groups at each time point, together with the confidence interval for that point estimate of the difference, roughly like this (the styling will stay as above; this is just to give you an idea):

The line should indicate the difference between the two groups' proportions of outcome 1 on each particular day. Thanks a lot in advance for helping. :)

Recommended answer

How do you expect to calculate CIs if you don't have any measure of the uncertainty in prop?

That aside, you can reshape the data in the following way to plot the difference of proportions:

library(dplyr)
library(tidyr)    # spread()
library(ggplot2)

Data %>%
    mutate(
        treatment_group = factor(treatment_group, levels = c("T", "C")),
        date = as.POSIXct(date)) %>%              # convert date to date-time
    group_by(treatment_group, date) %>%           # one row per group and date
    summarise(
        prop = sum(outcome == "1") / n()) %>%     # proportion with outcome 1
    spread(treatment_group, prop) %>%             # long to wide: one column per group
    mutate(propdiff = T - C) %>%                  # difference in proportions
    ggplot(aes(date, propdiff)) +
    geom_line() +
    geom_point()

Explanation: After summarise, we convert the data from long to wide with spread, so that each group's proportion sits in its own column, and then calculate propdiff as prop(T) - prop(C).
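The question also asks for a confidence interval around the difference, which requires keeping the group sizes rather than only the proportions. The following is a minimal sketch, not part of the original answer, that assumes a Wald (normal-approximation) 95% interval is acceptable. It regenerates simulated data along the lines of the question's example, and the names diff_ci, n_t, n_c, p_t, p_c, se, lower and upper are introduced here for illustration only.

```r
library(dplyr)
library(ggplot2)

# Simulated data in the spirit of the question's example
set.seed(3546)
Data <- data.frame(
    date = sample(seq(as.Date("2011-12-30"), as.Date("2012-01-04"), by = "day"),
                  1000, replace = TRUE),
    treatment_group = sample(c("C", "T"), 1000, replace = TRUE),
    outcome = sample(c("1", "0"), 1000, replace = TRUE)
)

# Per date: group sizes and proportions, then the difference with a Wald CI
diff_ci <- Data %>%
    group_by(date) %>%
    summarise(
        n_t = sum(treatment_group == "T"),
        n_c = sum(treatment_group == "C"),
        p_t = mean(outcome[treatment_group == "T"] == "1"),
        p_c = mean(outcome[treatment_group == "C"] == "1"),
        .groups = "drop"
    ) %>%
    mutate(
        propdiff = p_t - p_c,
        # standard error of a difference of two independent proportions
        se = sqrt(p_t * (1 - p_t) / n_t + p_c * (1 - p_c) / n_c),
        lower = propdiff - qnorm(0.975) * se,
        upper = propdiff + qnorm(0.975) * se
    )

# Difference line with a shaded 95% band and a reference line at zero
p <- ggplot(diff_ci, aes(date, propdiff)) +
    geom_ribbon(aes(ymin = lower, ymax = upper), alpha = 0.2) +
    geom_line() +
    geom_point() +
    geom_hline(yintercept = 0, linetype = "dashed") +
    theme_classic() +
    xlab("Date") +
    ylab("Difference in proportion (T - C)")
print(p)
```

The Wald interval can behave poorly when the daily counts are small or the proportions are close to 0 or 1; in that case the interval reported by prop.test() for two groups may be a reasonable alternative.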
