Problem description
The following problem can be seen as a "two-column reshape to wide", and there are several classical ways to solve it, from base::reshape (horror) to reshape2. For the two-group case, a simple subgroup join works best.
Can I reformulate the join within the piping framework of dplyr? The example below is a bit silly, but I need the join in a longer pipe chain that I do not want to break.
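library(dplyr)
# Long format: two rows per subject, one per treatment
d = data.frame(subject = rep(1:5, each = 2), treatment = letters[1:2], bp = rnorm(10))
# Desired shape of the pipe (placeholder, not runnable as is):
# d %>%
#   # Assume piped manipulations here
#   # Make wide
#   # Assume additional piped manipulations here
# Make wide (old style)
with(d, left_join(d[treatment == "a", ],
                  d[treatment == "b", ], by = "subject"))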
Recommended answer
d %>%
filter(treatment == "a") %>%
left_join(., filter(d, treatment == "b"), by = "subject")
#  subject treatment.x       bp.x treatment.y      bp.y
#1       1           a  0.4392647           b 0.6741559
#2       2           a -0.6010311           b 1.9845774
#3       3           a  0.1749082           b 1.7678771
#4       4           a -0.3089731           b 0.4427471
#5       5           a -0.8346091           b 1.7156319
You could continue the pipe right after the left join.
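One caveat with the join above is that the right-hand side refers to d by name, so it would not see the result of any earlier piped steps. A minimal sketch of one way around this, assuming the magrittr pipe that dplyr re-exports: wrapping the join in braces lets `.` stand for the piped data on both sides. The rounding and difference steps, the seed, and the ".a"/".b" suffixes are illustrative placeholders, not part of the original answer.

```r
library(dplyr)

set.seed(1)  # assumption: fixed seed only so the sketch is reproducible
d <- data.frame(subject = rep(1:5, each = 2),
                treatment = letters[1:2],
                bp = rnorm(10))

wide <- d %>%
  mutate(bp = round(bp, 2)) %>%   # stand-in for earlier piped manipulations
  {
    # Inside braces, `.` is the piped data and is not inserted as a first argument
    left_join(filter(., treatment == "a"),
              filter(., treatment == "b"),
              by = "subject", suffix = c(".a", ".b"))
  } %>%
  mutate(diff = bp.b - bp.a)      # stand-in for later piped manipulations
```

With the suffix argument the joined columns come out as bp.a and bp.b rather than bp.x and bp.y, which may be easier to read further down the chain.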
Or, if you don't need the separate treatment columns, you could use tidyr:
library(tidyr)
d %>% spread(treatment, bp)
#  subject          a         b
#1       1  0.4392647 0.6741559
#2       2 -0.6010311 1.9845774
#3       3  0.1749082 1.7678771
#4       4 -0.3089731 0.4427471
#5       5 -0.8346091 1.7156319
(which is the same as using d %>% dcast(subject ~ treatment, value.var = "bp") from the reshape2 package, as noted by Henrik in the comments)
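For readers on a newer tidyr, the same reshape can probably be written with pivot_wider(). This is a sketch assuming tidyr 1.0.0 or later, where spread() still works but is marked as superseded; it is not part of the original answer.

```r
library(dplyr)
library(tidyr)

d <- data.frame(subject = rep(1:5, each = 2),
                treatment = letters[1:2],
                bp = rnorm(10))

# One row per subject; the treatment levels become the columns a and b
wide <- d %>%
  pivot_wider(names_from = treatment, values_from = bp)
```

Note that pivot_wider() returns a tibble rather than a plain data.frame, so the printed output looks slightly different from the spread() result.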