I have some survey data that looks like this:

                                    Freetime_activities
1                       Travelling, On the PC, Clubbing
2                           Sports, On the PC, Clubbing
3                                              Clubbing
4                                             On the PC
5                       Travelling, On the PC, Clubbing
6                                             On the PC
7                               Watching TV, Travelling


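I want to get a count for each value (how many times Travelling, On the PC, etc. occur), but I am having trouble splitting the values. Is there a function in R that does something like this: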

split("A,B,C") ->
1 A
2 B
3 C


Or is there a more direct way to solve this problem?

Best answer

We can use strsplit to split the column on the delimiter ", ", unlist the resulting list, and then use table to get the frequencies:

 tbl <- table(unlist(strsplit(as.character(df1$Freetime_activities),
                                          ", ")))
 as.data.frame(tbl)
 #         Var1 Freq
 #1    Clubbing    4
 #2   On the PC    5
 #3      Sports    1
 #4  Travelling    3
 #5 Watching TV    1
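
To see what each step contributes, the one-liner above can be unrolled roughly as follows. This is only a sketch: the vector `x` is a hypothetical stand-in for the first three rows of `df1$Freetime_activities`, and `fixed = TRUE` is an optional addition that treats the delimiter as a literal string rather than a regular expression.

```r
# Hypothetical stand-in for df1$Freetime_activities (first three rows only)
x <- c("Travelling, On the PC, Clubbing",
       "Sports, On the PC, Clubbing",
       "Clubbing")

# Step 1: strsplit() returns a list with one character vector per row
parts <- strsplit(x, ", ", fixed = TRUE)

# Step 2: unlist() flattens that list into a single character vector
flat <- unlist(parts)

# Step 3: table() counts how often each distinct value occurs
counts <- table(flat)
print(counts)
```

Keeping the intermediate objects around makes it easier to check where a split goes wrong, for example when the delimiter in the real data is not exactly a comma followed by one space.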



Note: as.character is used here in case the column is a factor, because strsplit only accepts a character vector.
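
A small sketch of why the conversion matters, assuming a hypothetical factor `f` (R versions before 4.0.0 create factor columns by default in data.frame()):

```r
# Hypothetical factor column, used only for illustration
f <- factor(c("Sports, On the PC", "Clubbing"))

# strsplit() on the factor itself fails with a "non-character argument" error;
# tryCatch() captures the message instead of stopping the script
res <- tryCatch(strsplit(f, ", "), error = function(e) conditionMessage(e))
print(res)

# Converting first works whether the column is a factor or already character
parts <- strsplit(as.character(f), ", ")
print(unlist(parts))
```

Calling as.character on a column that is already character is harmless, so the wrapper can stay in place either way.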
Another option is to use scan to extract the elements and then table to get the frequencies:
 table(trimws(scan(text = as.character(df1$Freetime_activities),
                   what = "", sep = ",")))
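
The trimws call is needed because scan splits on the comma alone. A minimal sketch, using a hypothetical two-row sample `x` and `quiet = TRUE` to suppress the "Read N items" message:

```r
# Hypothetical sample, not the full data
x <- c("Sports, On the PC", "Clubbing")

# scan() splits on "," only, so the space after each comma stays attached
raw <- scan(text = x, what = "", sep = ",", quiet = TRUE)
print(raw)

# trimws() strips the padding so " On the PC" and "On the PC" count as one value
clean <- trimws(raw)
print(table(clean))
```

Without the trimming step, values that appear both at the start of a row and after a comma would be counted as two different categories.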

Or use read.table together with unlist and table:
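# na.strings = "" turns the blank fields padded in by fill = TRUE into NA,
# which table() drops by default (otherwise an empty "" category is counted)
table(unlist(read.table(text = as.character(df1$Freetime_activities),
           sep = ",", fill = TRUE, strip.white = TRUE, na.strings = "")))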

Edit: based on a comment by @David Arenburg.
Data
df1 <- structure(list(Freetime_activities = c("Travelling, On the PC, Clubbing",
"Sports, On the PC, Clubbing", "Clubbing", "On the PC",
"Travelling, On the PC, Clubbing",
"On the PC", "Watching TV, Travelling")),
 .Names = "Freetime_activities",
 class = "data.frame", row.names = c("1",
"2", "3", "4", "5", "6", "7"))
