dplyr and subgroup join

Problem description

The following problem can be seen as a "two-column reshape to wide", and there are several methods available to solve it the classical way, from base::reshape (horror) to reshape2. For the two-group case, a simple subgroup join works best.

Can I reformulate the join within the piping framework of dplyr? The example below is a bit silly, but I needed the join in a longer pipe-chain which I do not want to break.

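library(dplyr)

# Example data: 5 subjects, each measured under treatments "a" and "b"
d <- data.frame(subject = rep(1:5, each = 2),
                treatment = letters[1:2],
                bp = rnorm(10))

# Desired shape of the pipeline (placeholder only, not runnable as written):
# d %>%
#   <piped manipulations> %>%
#   <make wide> %>%
#   <additional piped manipulations>

# Make wide (old style): join the "a" rows to the "b" rows by subject
with(d, left_join(d[treatment == "a", ],
                  d[treatment == "b", ], by = "subject"))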

Recommended answer

d %>%
  filter(treatment == "a") %>%
  left_join(., filter(d, treatment == "b"), by = "subject")

#  subject treatment.x       bp.x treatment.y      bp.y
#1       1           a  0.4392647           b 0.6741559
#2       2           a -0.6010311           b 1.9845774
#3       3           a  0.1749082           b 1.7678771
#4       4           a -0.3089731           b 0.4427471
#5       5           a -0.8346091           b 1.7156319

You could continue the pipe right after the left join.
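One caveat with the answer above: the right-hand side of the join refers to the original d, so any manipulations made earlier in the pipe are not reflected in the "b" rows. A minimal sketch of one way around this, assuming magrittr's brace syntax (inside { }, the dot stands for the piped-in data and is not inserted as a first argument automatically). The rounding step, the suffix values and the bp_diff column are only illustrative stand-ins, not part of the original question:

```r
library(dplyr)

set.seed(1)  # assumption: fixed seed, only to make the sketch reproducible
d <- data.frame(subject = rep(1:5, each = 2),
                treatment = letters[1:2],
                bp = rnorm(10))

wide <- d %>%
  mutate(bp = round(bp, 2)) %>%            # stand-in for upstream pipe steps
  {
    # self-join of the piped-in data: "a" rows on the left, "b" rows on the right
    left_join(filter(., treatment == "a"),
              filter(., treatment == "b"),
              by = "subject", suffix = c(".a", ".b"))
  } %>%
  mutate(bp_diff = bp.b - bp.a)            # stand-in for downstream pipe steps

print(wide)
```

Both sides of the join then see the rounded bp values, and the chain is never broken.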

Or if you don't require the separate treatment columns, you could use tidyr to do:

library(tidyr)
d %>% spread(treatment, bp)
#  subject          a         b
#1       1  0.4392647 0.6741559
#2       2 -0.6010311 1.9845774
#3       3  0.1749082 1.7678771
#4       4 -0.3089731 0.4427471
#5       5 -0.8346091 1.7156319

(which is the same as using d %>% dcast(subject ~ treatment, value.var = "bp") from the reshape2 package, as noted by Henrik in the comments)
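In tidyr 1.0.0 and later, spread() is marked as superseded in favour of pivot_wider(). A minimal sketch of the same reshape with the newer function, assuming such a tidyr version is installed; the bp_diff column is a hypothetical downstream step, added only to show that the pipe continues:

```r
library(dplyr)
library(tidyr)

set.seed(1)  # assumption: fixed seed, only to make the sketch reproducible
d <- data.frame(subject = rep(1:5, each = 2),
                treatment = letters[1:2],
                bp = rnorm(10))

wide <- d %>%
  # one column per treatment level, filled with the bp values
  pivot_wider(names_from = treatment, values_from = bp) %>%
  mutate(bp_diff = b - a)                  # hypothetical downstream step

print(wide)
```

As with spread(), this gives one row per subject with columns a and b, and the separate treatment columns are dropped.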
