

I love the reshape2 package because it made life so doggone easy. Typically Hadley has made improvements over his previous packages that enable streamlined, faster-running code. I figured I'd give tidyr a whirl, and from what I read I thought gather was very similar to melt from reshape2. But after reading the documentation, I can't get gather to do the same task that melt does.


Here's a view of the data (actual data in dput form at end of post):

  teacher yr1.baseline     pd yr1.lesson1 yr1.lesson2 yr2.lesson1 yr2.lesson2 yr2.lesson3
1       3      1/13/09 2/5/09      3/6/09     4/27/09     10/7/09    11/18/09      3/4/10
2       7      1/15/09 2/5/09      3/3/09      5/5/09    10/16/09    11/18/09      3/4/10
3       8      1/27/09 2/5/09      3/3/09     4/27/09     10/7/09    11/18/09      3/5/10


Here's the code in melt fashion, followed by my attempt at gather. How can I make gather do the same thing as melt?

library(reshape2); library(dplyr); library(tidyr)

dat %>%
   melt(id=c("teacher", "pd"), value.name="date")

dat %>%
   gather(key=c(teacher, pd), value=date, -c(teacher, pd))


Desired output (what melt produces):

   teacher     pd     variable     date
1        3 2/5/09 yr1.baseline  1/13/09
2        7 2/5/09 yr1.baseline  1/15/09
3        8 2/5/09 yr1.baseline  1/27/09
4        3 2/5/09  yr1.lesson1   3/6/09
5        7 2/5/09  yr1.lesson1   3/3/09
6        8 2/5/09  yr1.lesson1   3/3/09
7        3 2/5/09  yr1.lesson2  4/27/09
8        7 2/5/09  yr1.lesson2   5/5/09
9        8 2/5/09  yr1.lesson2  4/27/09
10       3 2/5/09  yr2.lesson1  10/7/09
11       7 2/5/09  yr2.lesson1 10/16/09
12       8 2/5/09  yr2.lesson1  10/7/09
13       3 2/5/09  yr2.lesson2 11/18/09
14       7 2/5/09  yr2.lesson2 11/18/09
15       8 2/5/09  yr2.lesson2 11/18/09
16       3 2/5/09  yr2.lesson3   3/4/10
17       7 2/5/09  yr2.lesson3   3/4/10
18       8 2/5/09  yr2.lesson3   3/5/10


dat <- structure(list(teacher = structure(1:3, .Label = c("3", "7",
    "8"), class = "factor"), yr1.baseline = structure(1:3, .Label = c("1/13/09",
    "1/15/09", "1/27/09"), class = "factor"), pd = structure(c(1L,
    1L, 1L), .Label = "2/5/09", class = "factor"), yr1.lesson1 = structure(c(2L,
    1L, 1L), .Label = c("3/3/09", "3/6/09"), class = "factor"), yr1.lesson2 = structure(c(1L,
    2L, 1L), .Label = c("4/27/09", "5/5/09"), class = "factor"),
        yr2.lesson1 = structure(c(2L, 1L, 2L), .Label = c("10/16/09",
        "10/7/09"), class = "factor"), yr2.lesson2 = structure(c(1L,
        1L, 1L), .Label = "11/18/09", class = "factor"), yr2.lesson3 = structure(c(1L,
        1L, 2L), .Label = c("3/4/10", "3/5/10"), class = "factor")), .Names = c("teacher",
    "yr1.baseline", "pd", "yr1.lesson1", "yr1.lesson2", "yr2.lesson1",
    "yr2.lesson2", "yr2.lesson3"), row.names = c(NA, -3L), class = "data.frame")


Your gather line should look like:

dat %>% gather(variable, date, -teacher, -pd)

This says "Gather all variables except teacher and pd, calling the new key column 'variable' and the new value column 'date'."

As an explanation, note the following from the help(gather) page:

 ...: Specification of columns to gather. Use bare variable names.
      Select all variables between x and z with ‘x:z’, exclude y
      with ‘-y’. For more options, see the select documentation.

Since this is an ellipsis, the specification of columns to gather is given as separate (bare name) arguments. We wish to gather all columns except teacher and pd, so we prefix each of them with -.
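
To make the column specification concrete, here is a minimal sketch on a cut-down stand-in for the question's data. The name `dat_small` and the `long_*` objects are made up for illustration, and the columns are plain character vectors rather than factors to keep the example simple. It shows three ways of writing the same selection, assuming a reasonably recent tidyr:

```r
library(tidyr)

# Hypothetical, cut-down version of the question's data (character columns)
dat_small <- data.frame(
  teacher      = c("3", "7", "8"),
  yr1.baseline = c("1/13/09", "1/15/09", "1/27/09"),
  pd           = c("2/5/09", "2/5/09", "2/5/09"),
  yr1.lesson1  = c("3/6/09", "3/3/09", "3/3/09"),
  stringsAsFactors = FALSE
)

# Exclude the id columns one at a time, as in the line above
long_a <- gather(dat_small, variable, date, -teacher, -pd)

# Exclude them together; note that this goes in `...`, not in `key`
long_b <- gather(dat_small, variable, date, -c(teacher, pd))

# Or name the columns to gather explicitly
long_c <- gather(dat_small, variable, date, yr1.baseline, yr1.lesson1)

print(long_a)
```

One thing to watch with the `x:z` form: the range is positional, so in this layout `yr1.baseline:yr1.lesson1` would also sweep up `pd`, which sits between the two. Excluding the id columns with `-` avoids that.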


