How do I summarize data that is split across multiple columns?
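Problem description
I have a dataset containing the answers to a "select as many as apply" question, with each possible answer in its own column. Say the question is "What color shirts would you accept?" The data looks like this: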
id  Q3_Red  Q3_Blue  Q3_Green  Q3_Purple
 9
 8                   Green     Purple
 7                   Green
 6  Red
 5                             Purple
 4          Blue
 3          Blue               Purple
 2  Red     Blue     Green
 1  Red                        Purple
10  Red                        Purple
You can build it as an actual data frame with the following command:
tmp <- data.frame(
  id        = c(9, 8, 7, 6, 5, 4, 3, 2, 1, 10),
  Q3_Red    = c("", "", "", "Red", "", "", "", "Red", "Red", "Red"),
  Q3_Blue   = c("", "", "", "", "", "Blue", "Blue", "Blue", "", ""),
  Q3_Green  = c("", "Green", "Green", "", "", "", "", "Green", "", ""),
  Q3_Purple = c("", "Purple", "", "", "Purple", "", "Purple", "", "Purple", "Purple")
)
I want to summarize it with a count for each answer, for example:
Red 4
Blue 3
Green 3
Purple 5
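I could count each column with something like tmp %>% count(Q3_Red) and collect the results into separate data frames, but it seems there must be a way to do this in one go with a reshaping function. I've looked at gather() and spread(), but I can't figure out how to combine tidyr with count().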
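For reference, the column-by-column idea described above can be written without repeating a count for each column. This is only a minimal base R sketch, assuming every answer column starts with the prefix Q3_ and that an unselected answer is stored as an empty string; the name per_column is made up for illustration.

```r
# Same data as in the question; stringsAsFactors = FALSE keeps the
# answers as plain character vectors on older R versions too.
tmp <- data.frame(
  id        = c(9, 8, 7, 6, 5, 4, 3, 2, 1, 10),
  Q3_Red    = c("", "", "", "Red", "", "", "", "Red", "Red", "Red"),
  Q3_Blue   = c("", "", "", "", "", "Blue", "Blue", "Blue", "", ""),
  Q3_Green  = c("", "Green", "Green", "", "", "", "", "Green", "", ""),
  Q3_Purple = c("", "Purple", "", "", "Purple", "", "Purple", "", "Purple", "Purple"),
  stringsAsFactors = FALSE
)

# Count the non-blank entries in every answer column (everything except id).
per_column <- sapply(tmp[-1], function(col) sum(col != ""))

# Strip the assumed "Q3_" prefix so the names match the answer labels.
names(per_column) <- sub("^Q3_", "", names(per_column))

print(per_column)
```

This returns a named vector rather than a data frame, so it works as a quick sanity check but is less convenient if the counts need to feed into further dplyr steps.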
Recommended answer
dplyr and tidyr are your friends here:
library(dplyr)
library(tidyr)
tmp %>%
pivot_longer(cols = -id, values_to = "response") %>% # pivot all columns but id
filter(response != "") %>% # remove blanks
group_by(response) %>% # group by response
summarize(count = n()) # summarize and count
# A tibble: 4 x 2
  response count
  <chr>    <int>
1 Blue         3
2 Green        3
3 Purple       5
4 Red          4
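Since the question asks specifically about combining tidyr with count(), the group_by() and summarize() pair can likely be collapsed into a single count() call. The following is a sketch under that assumption, not part of the original answer; the column name question and the variable counts are chosen here for illustration.

```r
library(dplyr)
library(tidyr)

# Same data as in the question, kept as character columns.
tmp <- data.frame(
  id        = c(9, 8, 7, 6, 5, 4, 3, 2, 1, 10),
  Q3_Red    = c("", "", "", "Red", "", "", "", "Red", "Red", "Red"),
  Q3_Blue   = c("", "", "", "", "", "Blue", "Blue", "Blue", "", ""),
  Q3_Green  = c("", "Green", "Green", "", "", "", "", "Green", "", ""),
  Q3_Purple = c("", "Purple", "", "", "Purple", "", "Purple", "", "Purple", "Purple"),
  stringsAsFactors = FALSE
)

counts <- tmp %>%
  # Stack every column except id into question/response pairs.
  pivot_longer(cols = -id, names_to = "question", values_to = "response") %>%
  # Drop the blanks that mark an unselected answer.
  filter(response != "") %>%
  # count() groups by response and tallies the rows in one step.
  count(response, name = "count")

print(counts)
```

The result has the same two columns as the group_by() version; count() is just shorthand for grouping, summarizing with n(), and ungrouping.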