Calculating the length of the 95% CI with dplyr

Problem description

Last time I asked how it was possible to calculate the average score per measurement occasion (week) for a variable (procras) that has been measured repeatedly for multiple respondents.
So my (simplified) dataset in long format looks, for example, like the following (here two students and 5 time points, no grouping variable):

    studentID  week  procras
            1     0      1.4
            1     6      1.2
            1    16      1.6
            1    28       NA
            1    40      3.8
            2     0      1.4
            2     6      1.8
            2    16      2.0
            2    28      2.5
            2    40      2.8

Using dplyr I would get the average score per measurement occasion:

    mean_data <- group_by(DataRlong, week) %>%
      summarise(procras = mean(procras, na.rm = TRUE))

Looking like this, e.g.:

    Source: local data frame [5 x 2]

        occ  procras
      (dbl)    (dbl)
    1     0 1.993141
    2     6 2.124020
    3    16 2.251548
    4    28 2.469658
    5    40 2.617903

With ggplot2 I could now plot the average change over time, and by easily adjusting the group_by() call of dplyr I could also get means per subgroup (for instance, the average score per occasion for men and women).

Now I would like to add a column to the mean_data table which includes the length of the 95% CIs around the average score per occasion.

http://www.cookbook-r.com/Graphs/Plotting_means_and_error_bars_(ggplot2)/ explains how to get and plot CIs, but this approach seems to become problematic as soon as I want to do this for any subgroup, right? So is there a way to let dplyr also include the CI (based on group size, etc.) automatically in mean_data?

After that it should be fairly easy to plot the new values as CIs into the graphs, I hope.

Thank you.
Solution

You could do it manually using a few extra functions in summarise, followed by mutate:

    library(dplyr)

    mtcars %>%
      group_by(vs) %>%
      # Per-group mean, standard deviation and group size.
      # n() counts all rows, NA included; mtcars$mpg has no missing values.
      summarise(mean.mpg = mean(mpg, na.rm = TRUE),
                sd.mpg = sd(mpg, na.rm = TRUE),
                n.mpg = n()) %>%
      # Standard error and t-based 95% confidence limits.
      mutate(se.mpg = sd.mpg / sqrt(n.mpg),
             lower.ci.mpg = mean.mpg - qt(1 - (0.05 / 2), n.mpg - 1) * se.mpg,
             upper.ci.mpg = mean.mpg + qt(1 - (0.05 / 2), n.mpg - 1) * se.mpg)

    #> Source: local data frame [2 x 7]
    #>
    #>      vs mean.mpg   sd.mpg n.mpg    se.mpg lower.ci.mpg upper.ci.mpg
    #>   (dbl)    (dbl)    (dbl) (int)     (dbl)        (dbl)        (dbl)
    #> 1     0 16.61667 3.860699    18 0.9099756     14.69679     18.53655
    #> 2     1 24.55714 5.378978    14 1.4375924     21.45141     27.66287
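The same pattern can be applied to the data from the question. The following is only a sketch under a few assumptions: the data frame is rebuilt by hand from the ten rows shown in the question, the column names ending in `.procras`, `margin`, and `ci.length` are made up for illustration, and the group size is taken as the number of non-missing values (rather than `n()`), because `procras` contains an `NA`. The length of the interval is simply the upper limit minus the lower limit.

```r
library(dplyr)

# Sample data copied from the question: two students, five measurement weeks.
DataRlong <- data.frame(
  studentID = rep(1:2, each = 5),
  week      = rep(c(0, 6, 16, 28, 40), times = 2),
  procras   = c(1.4, 1.2, 1.6, NA, 3.8,
                1.4, 1.8, 2.0, 2.5, 2.8)
)

mean_data <- DataRlong %>%
  group_by(week) %>%
  summarise(
    mean.procras = mean(procras, na.rm = TRUE),
    sd.procras   = sd(procras, na.rm = TRUE),
    # Assumption: count only non-missing scores, since NA rows are
    # dropped from the mean and the standard deviation as well.
    n.procras    = sum(!is.na(procras))
  ) %>%
  mutate(
    se.procras = sd.procras / sqrt(n.procras),
    # With a single observation there are no degrees of freedom left,
    # so the interval is left as NA for that week.
    df.procras = ifelse(n.procras > 1, n.procras - 1, NA_real_),
    margin     = qt(1 - 0.05 / 2, df = df.procras) * se.procras,
    lower.ci   = mean.procras - margin,
    upper.ci   = mean.procras + margin,
    ci.length  = upper.ci - lower.ci
  )

print(mean_data)
```

With only two students per week the intervals are very wide, and week 28 has a single valid score, so no interval can be computed there. For subgroups, adding the grouping variable to `group_by()` (for example a hypothetical `gender` column) should give one interval per subgroup and week, and `lower.ci` and `upper.ci` can then be mapped to `ymin` and `ymax` in ggplot2's `geom_errorbar()`.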