Problem description
I have some data that is not tidy. It has two nested repeated measures (Q1/Q2 nested within Constructs). I'd like to move it from wide to long format.
## id time Q1..Ask Q2..Ask Q1..Tell Q2..Tell Q1..Respond Q2..Respond
## 1 1 pre 1 1 1 1 0 0
## 2 2 pre 0 1 1 0 0 1
## 3 3 pre 0 0 1 0 0 0
## 4 4 pre 1 1 0 1 1 0
## 5 5 pre 0 0 0 0 0 0
## 6 1 post 0 0 1 1 0 1
## 7 2 post 0 0 1 1 0 0
## 8 3 post 0 0 0 1 0 0
## 9 4 post 1 0 1 1 0 0
## 10 5 post 0 1 0 1 1 1
Here question 1 and question 2 (Q1 & Q2) are two different questions aimed at the same construct. So Q1..Ask and Q2..Ask are the scores for questions 1 and 2 targeted at the Ask construct. Using tidyr, how can I turn Q1/Q2 into a column (Question) and the latter part of the column headers into a Construct column, with a Score column holding the values?
# MWE
if (!require("pacman")) install.packages("pacman")
pacman::p_load(dplyr, tidyr)
set.seed(10)
# tibble() replaces the deprecated data_frame()
dat <- tibble(
  id = c(1:5, 1:5),
  time = rep(c("pre", "post"), each = 5),
  Q1..Ask = sample(0:1, 10, TRUE),
  Q2..Ask = sample(0:1, 10, TRUE),
  Q1..Tell = sample(0:1, 10, TRUE),
  Q2..Tell = sample(0:1, 10, TRUE),
  Q1..Respond = sample(0:1, 10, TRUE),
  Q2..Respond = sample(0:1, 10, TRUE)
)
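# Code that makes it long format, but without tidyr
Map(function(x, y) {
  # x and y are the column positions of Q1 and Q2 for one construct
  tibble(
    ID = rep(dat[["id"]], 2),
    Time = rep(dat[["time"]], 2),
    Question = rep(c("Q1", "Q2"), each = 10),
    Construct = rep(gsub("Q[12]\\.+", "", colnames(dat)[x]), 20),
    Score = c(dat[[x]], dat[[y]])
  )
}, c(3, 5, 7), c(4, 6, 8)) %>%
  bind_rows()  # rbind_all() has been removed from dplyr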
# Desired output
## ID Time Question Construct Score
## 1 1 pre Q1 Ask 1
## 2 2 pre Q1 Ask 0
## 3 3 pre Q1 Ask 0
## 4 4 pre Q1 Ask 1
## 5 5 pre Q1 Ask 0
## 6 1 post Q1 Ask 0
## 7 2 post Q1 Ask 0
## 8 3 post Q1 Ask 0
## 9 4 post Q1 Ask 1
## 10 5 post Q1 Ask 0
## 11 1 pre Q2 Ask 1
## 12 2 pre Q2 Ask 1
## 13 3 pre Q2 Ask 0
## 14 4 pre Q2 Ask 1
## 15 5 pre Q2 Ask 0
## 16 1 post Q2 Ask 0
## 17 2 post Q2 Ask 0
## 18 3 post Q2 Ask 0
## 19 4 post Q2 Ask 0
## 20 5 post Q2 Ask 1
## 21 1 pre Q1 Tell 1
## 22 2 pre Q1 Tell 1
## 23 3 pre Q1 Tell 1
## 24 4 pre Q1 Tell 0
## 25 5 pre Q1 Tell 0
## 26 1 post Q1 Tell 1
## 27 2 post Q1 Tell 1
## 28 3 post Q1 Tell 0
## 29 4 post Q1 Tell 1
## 30 5 post Q1 Tell 0
## 31 1 pre Q2 Tell 1
## 32 2 pre Q2 Tell 0
## 33 3 pre Q2 Tell 0
## 34 4 pre Q2 Tell 1
## 35 5 pre Q2 Tell 0
## 36 1 post Q2 Tell 1
## 37 2 post Q2 Tell 1
## 38 3 post Q2 Tell 1
## 39 4 post Q2 Tell 1
## 40 5 post Q2 Tell 1
## 41 1 pre Q1 Respond 0
## 42 2 pre Q1 Respond 0
## 43 3 pre Q1 Respond 0
## 44 4 pre Q1 Respond 1
## 45 5 pre Q1 Respond 0
## 46 1 post Q1 Respond 0
## 47 2 post Q1 Respond 0
## 48 3 post Q1 Respond 0
## 49 4 post Q1 Respond 0
## 50 5 post Q1 Respond 1
## 51 1 pre Q2 Respond 0
## 52 2 pre Q2 Respond 1
## 53 3 pre Q2 Respond 0
## 54 4 pre Q2 Respond 0
## 55 5 pre Q2 Respond 0
## 56 1 post Q2 Respond 1
## 57 2 post Q2 Respond 0
## 58 3 post Q2 Respond 0
## 59 4 post Q2 Respond 0
## 60 5 post Q2 Respond 1
Recommended answer
Try:
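library(tidyr)
# Stack every score column into Var/Score pairs, keeping id and time as identifiers
gather(dat, Var, Score, -id, -time) %>%
  # Split names such as "Q1..Ask" at the literal ".." into Question and Construct
  extract(Var, c('Question', 'Construct'),
          '([^.]+)\\.\\.([^.]+)')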
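For readers on a newer tidyr, the same reshape can probably be written as a single pivot_longer() call, which superseded gather(). The following is only a sketch under assumptions: it assumes tidyr 1.0.0 or later is installed, and it uses a small stand-in for dat with made-up values and only two constructs so that it runs on its own.

```r
library(tidyr)

# Stand-in for `dat`: same column layout as the question, illustrative values only
dat <- tibble::tibble(
  id = c(1, 2, 1, 2),
  time = c("pre", "pre", "post", "post"),
  Q1..Ask = c(1, 0, 0, 0),
  Q2..Ask = c(1, 1, 0, 0),
  Q1..Tell = c(1, 1, 1, 1),
  Q2..Tell = c(1, 0, 1, 1)
)

long <- pivot_longer(
  dat,
  cols = -c(id, time),                    # every column except the identifiers
  names_to = c("Question", "Construct"),  # one new column per piece of the old name
  names_sep = "\\.\\.",                   # split the old names at the literal ".."
  values_to = "Score"
)

print(long)
```

The rows come out grouped by original row rather than by Construct and Question as in the desired output, so a final sort (for example with dplyr::arrange()) would still be needed if the row order matters.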