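I have a single data frame that holds wide data for several districts and villages, recording the votes cast for each political party. Each district has a different set of parties: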

df_in <- data.frame(
  X1 = c(rep("District1", 3), rep("District2", 3)),
  X2 = c(rep(c("", "Village1", "Village2"), 2)),
  X3 = c("Party1", "30", "11", "Party1", "2", "59"),
  X4 = c("Party2", "55", "42", "Party2", "66", "44"),
  X5 = c("", "", "", "Party3", "32", "13"),
  X6 = c("", "", "", "Party4", "99", "75")
)

I would like to end up with a long dataset of votes, with one record per party for each village/district:
df_out <- data.frame(
  X1 = c(rep("District1", 4), rep("District2", 8)),
  X2 = c("Village1", "Village1", "Village2", "Village2", "Village1", "Village1", "Village1", "Village1", "Village2", "Village2", "Village2", "Village2"),
  X3 = c(
    rep(c("Party1", "Party2"), 2),
    rep(c("Party1", "Party2", "Party3", "Party4"), 2)
    ),
  X4 = c(30, 55, 11, 42, 2, 66, 32, 99, 59, 44, 13, 75)
)

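I would like to get from the input to the output in a single pipe. I have been working with a setup like the following, but so far without success: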
df_out <- df_in %>%
  split(.$X1) %>%
  map() %>%
  gather() %>%
  bind_rows()

Is this on the right track?

Best answer

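library(tidyverse)

df_in %>%
  split(.$X1) %>%                                 # one data frame per district
  map(. %>%
        gather(key, val, X3:X6) %>%               # stack the party columns into key/val pairs
        group_by(key) %>%
        mutate(key1 = first(val)) %>%             # the first value in each column is the party name
        filter(row_number() > 1 & val != "") %>%  # drop the header row and the empty cells
        ungroup() %>%
        rename(X4 = val, X3 = key1) %>%
        select(X1, X2, X3, X4)) %>%               # X4 is still character at this point
  bind_rows()                                     # stack the districts back together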
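
The pipeline above leaves the vote counts as character strings and returns rows grouped by party column rather than by village. As a rough sketch of one way to get closer to the requested df_out, the variant below swaps gather() for pivot_longer(), which assumes tidyr 1.0.0 or later. It also assumes that the header row of each district block is the one with an empty X2. The helper reshape_district and the intermediate names col, val, and party are hypothetical names introduced for this illustration, not part of the original answer.

```r
library(dplyr)
library(tidyr)
library(purrr)

# Input from the question, repeated so the block runs on its own.
df_in <- data.frame(
  X1 = c(rep("District1", 3), rep("District2", 3)),
  X2 = rep(c("", "Village1", "Village2"), 2),
  X3 = c("Party1", "30", "11", "Party1", "2", "59"),
  X4 = c("Party2", "55", "42", "Party2", "66", "44"),
  X5 = c("", "", "", "Party3", "32", "13"),
  X6 = c("", "", "", "Party4", "99", "75"),
  stringsAsFactors = FALSE
)

# Hypothetical helper: reshape the block of rows belonging to one district.
reshape_district <- function(d) {
  d %>%
    pivot_longer(X3:X6, names_to = "col", values_to = "val") %>%
    group_by(col) %>%
    mutate(party = first(val)) %>%    # the header row holds the party name
    ungroup() %>%
    filter(X2 != "", val != "") %>%   # drop the header row and the empty cells
    transmute(X1, X2, X3 = party, X4 = as.numeric(val))
}

df_long <- df_in %>%
  split(.$X1) %>%            # one data frame per district
  map(reshape_district) %>%  # reshape each district separately
  bind_rows()                # stack the districts back together
```

Because pivot_longer() walks the data row by row, the result is ordered by village and then by party, and as.numeric() is applied only after the header rows and empty cells have been filtered out, so no values are coerced to NA.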

Regarding "r - Split, reshape, and bind stacked wide data in a single pipe with tidyverse", we found a similar question on Stack Overflow: https://stackoverflow.com/questions/55273627/

10-12 17:05