This article covers how to compute differences between group members.

Problem description

I have measurements for different treatments of an experiment that ran over several rounds, like so:

set.seed(1)
df <- data.frame(treatment = rep(c('baseline', 'treatment 1', 'treatment 2'),
                                 times=5),
                 round = rep(1:5, each=3),
                 measurement1 = rep(1:5, each=3) + rnorm(15),
                 measurement2 = rep(1:5, each=3) + rnorm(15))

df

#      treatment round measurement1 measurement2
# 1     baseline     1    0.3735462    0.9550664
# 2  treatment 1     1    1.1836433    0.9838097
# 3  treatment 2     1    0.1643714    1.9438362
# 4     baseline     2    3.5952808    2.8212212
# 5  treatment 1     2    2.3295078    2.5939013
# 6  treatment 2     2    1.1795316    2.9189774
# 7     baseline     3    3.4874291    3.7821363
# 8  treatment 1     3    3.7383247    3.0745650
# 9  treatment 2     3    3.5757814    1.0106483
# 10    baseline     4    3.6946116    4.6198257
# 11 treatment 1     4    5.5117812    3.9438713
# 12 treatment 2     4    4.3898432    3.8442045
# 13    baseline     5    4.3787594    3.5292476
# 14 treatment 1     5    2.7853001    4.5218499
# 15 treatment 2     5    6.1249309    5.4179416

What I would like is a data.frame that contains the differences in the two measurements between each of the treatments and the baseline for each round. That is, grouped by round, I would like the respective measurement in the baseline treatment subtracted from each of the two measurements.

I'd prefer a dplyr solution if one exists but will accept anything that borders on elegant.

Solution

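You can use mutate_each for that. Here mydf is a data frame with the same columns as the df in the question; judging by the output, its values and treatment labels are not the ones printed there. Note that mutate_each() and funs() have since been deprecated in dplyr in favour of across():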

mydf %>%
  group_by(round) %>%
  mutate_each(funs(. - .[treatment=="baseline"]), -treatment) %>%
  filter(treatment!="baseline")

which gives:

Source: local data frame [10 x 4]
Groups: round [5]

    treatment round measurement1 measurement2
       (fctr) (int)        (dbl)        (dbl)
1  treatment1     1     1.558820   -0.6584485
2  treatment2     1    -0.068677    1.3364462
3  treatment1     2     1.769312   -0.2732490
4  treatment2     2     0.801357   -1.4852449
5  treatment1     3    -1.064394   -1.1513703
6  treatment2     3     2.433222   -0.7939903
7  treatment1     4     0.448744    0.1394982
8  treatment2     4    -1.066922   -1.1410085
9  treatment1     5     1.182761   -0.8311095
10 treatment2     5     0.138005    0.2622119
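
In dplyr 1.0.0 and later, the same per-round subtraction can be written with across(). The following is a minimal sketch, not the answerer's code: it assumes a recent dplyr is installed, it rebuilds the df from the question rather than the mydf used above, and the result name diffs is made up for illustration. It also assumes every round contains exactly one baseline row.

```r
library(dplyr)

# Data exactly as constructed in the question.
set.seed(1)
df <- data.frame(treatment = rep(c('baseline', 'treatment 1', 'treatment 2'),
                                 times = 5),
                 round = rep(1:5, each = 3),
                 measurement1 = rep(1:5, each = 3) + rnorm(15),
                 measurement2 = rep(1:5, each = 3) + rnorm(15))

# 'diffs' is a hypothetical name chosen for this sketch.
diffs <- df %>%
  group_by(round) %>%
  # Within each round, subtract the baseline row's value from every row.
  # Assumption: exactly one row per round has treatment == "baseline".
  mutate(across(c(measurement1, measurement2),
                ~ .x - .x[treatment == "baseline"])) %>%
  # The baseline rows are now all zero, so drop them.
  filter(treatment != "baseline") %>%
  ungroup()

diffs
```

With the question's data, the first row is treatment 1 in round 1, and its measurement1 is 1.1836433 - 0.3735462 = 0.8100971, so the numbers will not match the output printed above, which comes from different data.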


If you want to add the differences to your dataframe (just as @akrun did in his dplyr / tidyr alternative), you could also do:

mydf %>%
  group_by(round) %>%
  mutate(diff1 = measurement1 - measurement1[treatment=="baseline"],
         diff2 = measurement2 - measurement2[treatment=="baseline"]) %>%
  filter(treatment!="baseline")

which gives:

Source: local data table [10 x 6]

    treatment round measurement1 measurement2     diff1      diff2
       (fctr) (int)        (dbl)        (dbl)     (dbl)      (dbl)
1  treatment1     1     2.630392    -0.104258  1.558820 -0.6584485
2  treatment2     1     1.002895     1.890637 -0.068677  1.3364462
3  treatment1     2     3.822473     3.147443  1.769312 -0.2732490
4  treatment2     2     2.854518     1.935447  0.801357 -1.4852449
5  treatment1     3     1.520553     3.291122 -1.064394 -1.1513703
6  treatment2     3     5.018169     3.648502  2.433222 -0.7939903
7  treatment1     4     4.956380     4.544908  0.448744  0.1394982
8  treatment2     4     3.440714     3.264401 -1.066922 -1.1410085
9  treatment1     5     4.672056     5.082310  1.182761 -0.8311095
10 treatment2     5     3.627300     6.175631  0.138005  0.2622119
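
Since the question says anything reasonably elegant is acceptable, the diff1 / diff2 columns can also be computed without dplyr. This is a rough base R sketch under the same assumptions as the question's data (at most one baseline row per round); the names base_rows, others and idx are invented for illustration and do not appear in the original answer.

```r
# Data exactly as constructed in the question.
set.seed(1)
df <- data.frame(treatment = rep(c('baseline', 'treatment 1', 'treatment 2'),
                                 times = 5),
                 round = rep(1:5, each = 3),
                 measurement1 = rep(1:5, each = 3) + rnorm(15),
                 measurement2 = rep(1:5, each = 3) + rnorm(15))

# Split into the baseline rows and everything else (hypothetical names).
base_rows <- df[df$treatment == "baseline", ]
others    <- df[df$treatment != "baseline", ]

# For each non-baseline row, find the baseline row of the same round.
idx <- match(others$round, base_rows$round)

# Subtract the matching baseline value; a round without a baseline
# row yields NA here instead of an error.
others$diff1 <- others$measurement1 - base_rows$measurement1[idx]
others$diff2 <- others$measurement2 - base_rows$measurement2[idx]

others
```

Using match() on round keeps the original measurement columns intact and makes the pairing between each row and its baseline explicit, at the cost of a few more lines than the grouped mutate.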

07-31 15:12