Problem description
I have measurements for different treatments of an experiment that ran over several rounds, like so:

    set.seed(1)
    df <- data.frame(treatment = rep(c('baseline', 'treatment 1', 'treatment 2'), times=5),
                     round = rep(1:5, each=3),
                     measurement1 = rep(1:5, each=3) + rnorm(15),
                     measurement2 = rep(1:5, each=3) + rnorm(15))
    df
    #      treatment round measurement1 measurement2
    # 1     baseline     1    0.3735462    0.9550664
    # 2  treatment 1     1    1.1836433    0.9838097
    # 3  treatment 2     1    0.1643714    1.9438362
    # 4     baseline     2    3.5952808    2.8212212
    # 5  treatment 1     2    2.3295078    2.5939013
    # 6  treatment 2     2    1.1795316    2.9189774
    # 7     baseline     3    3.4874291    3.7821363
    # 8  treatment 1     3    3.7383247    3.0745650
    # 9  treatment 2     3    3.5757814    1.0106483
    # 10    baseline     4    3.6946116    4.6198257
    # 11 treatment 1     4    5.5117812    3.9438713
    # 12 treatment 2     4    4.3898432    3.8442045
    # 13    baseline     5    4.3787594    3.5292476
    # 14 treatment 1     5    2.7853001    4.5218499
    # 15 treatment 2     5    6.1249309    5.4179416

What I would like is a data.frame that contains the differences in the two measurements between each of the treatments and the baseline for each round. That is, grouped by round, I would like the respective measurement in the baseline treatment subtracted from each of the two measurements.

I'd prefer a dplyr solution if one exists, but will accept anything that borders on elegant.

Solution

You can use mutate_each for that (mydf is the data frame holding the measurements):

    mydf %>%
      group_by(round) %>%
      mutate_each(funs(. - .[treatment=="baseline"]), -treatment) %>%
      filter(treatment!="baseline")

which gives (this output comes from the answerer's own mydf, so the values and the treatment labels differ from the df printed in the question):

    Source: local data frame [10 x 4]
    Groups: round [5]

        treatment round measurement1 measurement2
           (fctr) (int)        (dbl)        (dbl)
    1  treatment1     1     1.558820   -0.6584485
    2  treatment2     1    -0.068677    1.3364462
    3  treatment1     2     1.769312   -0.2732490
    4  treatment2     2     0.801357   -1.4852449
    5  treatment1     3    -1.064394   -1.1513703
    6  treatment2     3     2.433222   -0.7939903
    7  treatment1     4     0.448744    0.1394982
    8  treatment2     4    -1.066922   -1.1410085
    9  treatment1     5     1.182761   -0.8311095
    10 treatment2     5     0.138005    0.2622119

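
As a side note, mutate_each() and funs() are deprecated in recent dplyr releases. A minimal sketch of what I believe is the equivalent call with across() (available from dplyr 1.0.0) follows. It runs on the df from the question rather than on mydf. The column selection via starts_with("measurement") and the result name diffs are my own choices, not part of the original answer.

```r
library(dplyr)

# Rebuild the data frame from the question (treatment labels contain a space).
set.seed(1)
df <- data.frame(
  treatment    = rep(c("baseline", "treatment 1", "treatment 2"), times = 5),
  round        = rep(1:5, each = 3),
  measurement1 = rep(1:5, each = 3) + rnorm(15),
  measurement2 = rep(1:5, each = 3) + rnorm(15)
)

diffs <- df %>%
  group_by(round) %>%
  # Within each round, subtract the baseline row's value from every row.
  # Assumption: each round contains exactly one baseline row.
  mutate(across(starts_with("measurement"),
                ~ .x - .x[treatment == "baseline"])) %>%
  ungroup() %>%
  # The baseline rows are now all zero, so drop them.
  filter(treatment != "baseline")

diffs
```

With the question's data, the first row (round 1, treatment 1) should hold 1.1836433 - 0.3735462 = 0.8100971 in measurement1.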
If you want to add the differences to your dataframe (just as @akrun did in his dplyr / tidyr alternative), you could also do:

    mydf %>%
      group_by(round) %>%
      mutate(diff1 = measurement1 - measurement1[treatment=="baseline"],
             diff2 = measurement2 - measurement2[treatment=="baseline"]) %>%
      filter(treatment!="baseline")

which gives:

    Source: local data table [10 x 6]

        treatment round measurement1 measurement2     diff1      diff2
           (fctr) (int)        (dbl)        (dbl)     (dbl)      (dbl)
    1  treatment1     1     2.630392    -0.104258  1.558820 -0.6584485
    2  treatment2     1     1.002895     1.890637 -0.068677  1.3364462
    3  treatment1     2     3.822473     3.147443  1.769312 -0.2732490
    4  treatment2     2     2.854518     1.935447  0.801357 -1.4852449
    5  treatment1     3     1.520553     3.291122 -1.064394 -1.1513703
    6  treatment2     3     5.018169     3.648502  2.433222 -0.7939903
    7  treatment1     4     4.956380     4.544908  0.448744  0.1394982
    8  treatment2     4     3.440714     3.264401 -1.066922 -1.1410085
    9  treatment1     5     4.672056     5.082310  1.182761 -0.8311095
    10 treatment2     5     3.627300     6.175631  0.138005  0.2622119
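
The subtraction above only works as intended when every round has exactly one baseline row. With none, mutate() should fail on a size mismatch, and with several, R recycles the baseline values. The sketch below adds an explicit check for that, then builds the difference columns for all measurement columns at once using the .names argument of across(). The check, the diff_ prefix, and the names baseline_per_round and with_diffs are illustrative assumptions, not part of the original answer.

```r
library(dplyr)

# Same data as in the question.
set.seed(1)
df <- data.frame(
  treatment    = rep(c("baseline", "treatment 1", "treatment 2"), times = 5),
  round        = rep(1:5, each = 3),
  measurement1 = rep(1:5, each = 3) + rnorm(15),
  measurement2 = rep(1:5, each = 3) + rnorm(15)
)

# Guard: count baseline rows per round and stop unless each round has exactly one.
baseline_per_round <- tapply(df$treatment == "baseline", df$round, sum)
stopifnot(all(baseline_per_round == 1))

with_diffs <- df %>%
  group_by(round) %>%
  # Keep the original columns and write the differences to new columns
  # named diff_measurement1 and diff_measurement2.
  mutate(across(starts_with("measurement"),
                ~ .x - .x[treatment == "baseline"],
                .names = "diff_{.col}")) %>%
  ungroup() %>%
  filter(treatment != "baseline")

with_diffs
```

Compared with spelling out diff1 and diff2 by hand, this form should pick up any further measurement columns without changes to the call.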