Problem description
I love the reshape2 package because it made life so doggone easy. Typically Hadley has made improvements over his previous packages that enable streamlined, faster-running code. I figured I'd give tidyr a whirl, and from what I read I thought gather was very similar to melt from reshape2. But after reading the documentation I can't get gather to do the same task that melt does.
Data view
Here's a view of the data (actual data in dput form at end of post):
  teacher yr1.baseline     pd yr1.lesson1 yr1.lesson2 yr2.lesson1 yr2.lesson2 yr2.lesson3
1       3      1/13/09 2/5/09      3/6/09     4/27/09     10/7/09    11/18/09      3/4/10
2       7      1/15/09 2/5/09      3/3/09      5/5/09    10/16/09    11/18/09      3/4/10
3       8      1/27/09 2/5/09      3/3/09     4/27/09     10/7/09    11/18/09      3/5/10
Code
Here's the code in melt fashion, followed by my attempt at gather. How can I make gather do the same thing as melt?
library(reshape2); library(dplyr); library(tidyr)

dat %>%
    melt(id=c("teacher", "pd"), value.name="date")

dat %>%
    gather(key=c(teacher, pd), value=date, -c(teacher, pd))
Desired output
   teacher     pd     variable     date
 1       3 2/5/09 yr1.baseline  1/13/09
 2       7 2/5/09 yr1.baseline  1/15/09
 3       8 2/5/09 yr1.baseline  1/27/09
 4       3 2/5/09  yr1.lesson1   3/6/09
 5       7 2/5/09  yr1.lesson1   3/3/09
 6       8 2/5/09  yr1.lesson1   3/3/09
 7       3 2/5/09  yr1.lesson2  4/27/09
 8       7 2/5/09  yr1.lesson2   5/5/09
 9       8 2/5/09  yr1.lesson2  4/27/09
10       3 2/5/09  yr2.lesson1  10/7/09
11       7 2/5/09  yr2.lesson1 10/16/09
12       8 2/5/09  yr2.lesson1  10/7/09
13       3 2/5/09  yr2.lesson2 11/18/09
14       7 2/5/09  yr2.lesson2 11/18/09
15       8 2/5/09  yr2.lesson2 11/18/09
16       3 2/5/09  yr2.lesson3   3/4/10
17       7 2/5/09  yr2.lesson3   3/4/10
18       8 2/5/09  yr2.lesson3   3/5/10
Data
dat <- structure(list(teacher = structure(1:3, .Label = c("3", "7",
"8"), class = "factor"), yr1.baseline = structure(1:3, .Label = c("1/13/09",
"1/15/09", "1/27/09"), class = "factor"), pd = structure(c(1L,
1L, 1L), .Label = "2/5/09", class = "factor"), yr1.lesson1 = structure(c(2L,
1L, 1L), .Label = c("3/3/09", "3/6/09"), class = "factor"), yr1.lesson2 = structure(c(1L,
2L, 1L), .Label = c("4/27/09", "5/5/09"), class = "factor"),
yr2.lesson1 = structure(c(2L, 1L, 2L), .Label = c("10/16/09",
"10/7/09"), class = "factor"), yr2.lesson2 = structure(c(1L,
1L, 1L), .Label = "11/18/09", class = "factor"), yr2.lesson3 = structure(c(1L,
1L, 2L), .Label = c("3/4/10", "3/5/10"), class = "factor")), .Names = c("teacher",
"yr1.baseline", "pd", "yr1.lesson1", "yr1.lesson2", "yr2.lesson1",
"yr2.lesson2", "yr2.lesson3"), row.names = c(NA, -3L), class = "data.frame")
Recommended answer
Your gather line should look like:
dat %>% gather(variable, date, -teacher, -pd)
This says "Gather all variables except teacher and pd, calling the new key column 'variable' and the new value column 'date'."
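As a quick check of that reading, the sketch below runs melt and gather side by side on a cut-down copy of the data and compares the results. The stand-in dat here keeps only two measure columns and stores them as plain character vectors rather than factors; both are assumptions made to keep the example short and free of factor-level warnings.

```r
library(reshape2)
library(tidyr)

# Cut-down stand-in for `dat` (assumption: character columns, not factors,
# and only two of the six measure columns).
dat <- data.frame(
  teacher      = c("3", "7", "8"),
  yr1.baseline = c("1/13/09", "1/15/09", "1/27/09"),
  pd           = c("2/5/09", "2/5/09", "2/5/09"),
  yr1.lesson1  = c("3/6/09", "3/3/09", "3/3/09"),
  stringsAsFactors = FALSE
)

# reshape2: name the id columns, everything else is melted
melted <- melt(dat, id.vars = c("teacher", "pd"), value.name = "date")

# tidyr: name the key and value columns, then exclude the id columns
gathered <- gather(dat, variable, date, -teacher, -pd)

# melt() returns `variable` as a factor, gather() as character,
# so convert before comparing the two columns
melted$variable <- as.character(melted$variable)
same_variable <- identical(melted$variable, gathered$variable)
same_date     <- identical(melted$date, gathered$date)
```

The one visible difference in this sketch is the type of the variable column, which is why it is converted before the comparison.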
As an explanation, note the following from the help(gather) page:
...: Specification of columns to gather. Use bare variable names.
Select all variables between x and z with ‘x:z’, exclude y
with ‘-y’. For more options, see the select documentation.
Since this is an ellipsis, the specification of columns to gather is given as separate (bare name) arguments. We wish to gather all columns except teacher and pd, so we use - to exclude them.
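The same columns can also be selected positively, using the bare names and the x:z range form quoted from the help page. The sketch below is illustrative only: it uses a cut-down, character-column stand-in for dat (an assumption), and shows why exclusion is the more convenient form here, since pd sits between yr1.baseline and the lesson columns.

```r
library(tidyr)

# Cut-down stand-in for `dat` (assumption: character columns, three measure columns)
dat <- data.frame(
  teacher      = c("3", "7", "8"),
  yr1.baseline = c("1/13/09", "1/15/09", "1/27/09"),
  pd           = c("2/5/09", "2/5/09", "2/5/09"),
  yr1.lesson1  = c("3/6/09", "3/3/09", "3/3/09"),
  yr1.lesson2  = c("4/27/09", "5/5/09", "4/27/09"),
  stringsAsFactors = FALSE
)

# Exclusion: gather everything except teacher and pd
by_exclusion <- gather(dat, variable, date, -teacher, -pd)

# Positive selection: one bare name plus a range that skips over pd
by_name <- gather(dat, variable, date, yr1.baseline, yr1.lesson1:yr1.lesson2)

# A single range from yr1.baseline to yr1.lesson2 would also pull in pd,
# because ranges follow column position
too_wide <- gather(dat, variable, date, yr1.baseline:yr1.lesson2)
pd_was_gathered <- "pd" %in% too_wide$variable
```

With many measure columns and only a couple of id columns, the exclusion form stays short and does not depend on column order.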