Calculating the length of the 95% CI with dplyr

Problem description

Last time I asked how to calculate the average score per measurement occasion (week) for a variable (procras) that has been measured repeatedly for multiple respondents.
So my (simplified) dataset in long format looks, for example, like the following (here two students and 5 time points, no grouping variable):

    studentID  week  procras
            1     0      1.4
            1     6      1.2
            1    16      1.6
            1    28       NA
            1    40      3.8
            2     0      1.4
            2     6      1.8
            2    16      2.0
            2    28      2.5
            2    40      2.8

Using dplyr I would get the average score per measurement occasion:

    mean_data <- group_by(DataRlong, week) %>%
      summarise(procras = mean(procras, na.rm = TRUE))

Looking like this, e.g.:

    Source: local data frame [5 x 2]

        occ  procras
      (dbl)    (dbl)
    1     0 1.993141
    2     6 2.124020
    3    16 2.251548
    4    28 2.469658
    5    40 2.617903

With ggplot2 I could now plot the average change over time, and by adjusting the group_by() call of dplyr I could also get means per subgroup (for instance, the average score per occasion for men and women).

Now I would like to add a column to the mean_data table that contains the length of the 95% CI around the average score per occasion.

http://www.cookbook-r.com/Graphs/Plotting_means_and_error_bars_(ggplot2)/ explains how to get and plot CIs, but this approach seems to become problematic as soon as I want to do this for any subgroup, right? So is there a way to let dplyr also include the CI (based on group size, etc.) automatically in mean_data?

After that it should be fairly easy to plot the new values as CIs in the graphs, I hope.

Thank you.
Solution

You could do it manually, using a few extra functions in summarise and then mutate:

    library(dplyr)

    mtcars %>%
      group_by(vs) %>%
      # per-group mean, standard deviation, and group size
      summarise(mean.mpg = mean(mpg, na.rm = TRUE),
                sd.mpg = sd(mpg, na.rm = TRUE),
                n.mpg = n()) %>%
      # standard error and t-based 95% confidence bounds
      mutate(se.mpg = sd.mpg / sqrt(n.mpg),
             lower.ci.mpg = mean.mpg - qt(1 - (0.05 / 2), n.mpg - 1) * se.mpg,
             upper.ci.mpg = mean.mpg + qt(1 - (0.05 / 2), n.mpg - 1) * se.mpg)

    #> Source: local data frame [2 x 7]
    #>
    #>      vs mean.mpg   sd.mpg n.mpg    se.mpg lower.ci.mpg upper.ci.mpg
    #>   (dbl)    (dbl)    (dbl) (int)     (dbl)        (dbl)        (dbl)
    #> 1     0 16.61667 3.860699    18 0.9099756     14.69679     18.53655
    #> 2     1 24.55714 5.378978    14 1.4375924     21.45141     27.66287
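As a rough sketch of how the same idea might be applied to the data from the question: the toy data frame below is copied from the question's example, while the column names `mean_procras`, `sd_procras`, `n_obs`, `se`, `t_crit`, `lower_ci`, `upper_ci` and `ci_length` are illustrative choices, not names from the original post. One deviation from the answer is deliberate and should be read as an assumption about what is wanted: because `procras` contains `NA`, the group size is counted with `sum(!is.na(procras))` rather than `n()`, since `n()` would also count the missing rows.

```r
library(dplyr)

# Toy data copied from the question: two students, five measurement weeks.
DataRlong <- data.frame(
  studentID = rep(1:2, each = 5),
  week      = rep(c(0, 6, 16, 28, 40), times = 2),
  procras   = c(1.4, 1.2, 1.6, NA, 3.8,
                1.4, 1.8, 2.0, 2.5, 2.8)
)

conf_level <- 0.95  # assumed confidence level

mean_data <- DataRlong %>%
  group_by(week) %>%
  summarise(
    mean_procras = mean(procras, na.rm = TRUE),
    sd_procras   = sd(procras, na.rm = TRUE),
    n_obs        = sum(!is.na(procras))  # count only observed scores
  ) %>%
  mutate(
    se        = sd_procras / sqrt(n_obs),
    # pmax() keeps df >= 1 so qt() does not warn when a week has a single
    # observed score; sd (and hence the CI) is NA for such a week anyway.
    t_crit    = qt(1 - (1 - conf_level) / 2, df = pmax(n_obs - 1, 1)),
    lower_ci  = mean_procras - t_crit * se,
    upper_ci  = mean_procras + t_crit * se,
    ci_length = upper_ci - lower_ci      # full width of the interval
  )

# Optional: build (but do not print) an error-bar plot if ggplot2 is installed.
if (requireNamespace("ggplot2", quietly = TRUE)) {
  ci_plot <- ggplot2::ggplot(mean_data,
                             ggplot2::aes(x = week, y = mean_procras)) +
    ggplot2::geom_line() +
    ggplot2::geom_point() +
    ggplot2::geom_errorbar(ggplot2::aes(ymin = lower_ci, ymax = upper_ci),
                           width = 1)
}
```

To get intervals per subgroup, it should be enough to add the grouping column to `group_by()`, for example `group_by(week, gender)` if the data had a (hypothetical) `gender` column; the rest of the pipeline stays the same because every quantity is computed within the groups.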
10-10 13:50