I have some poll data that looks like this:
Freetime_activities
1 Travelling, On the PC, Clubbing
2 Sports, On the PC, Clubbing
3 Clubbing
4 On the PC
5 Travelling, On the PC, Clubbing
6 On the PC
7 Watching TV, Travelling
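I want to get the count of each value (how many times Travelling, On the PC, etc. occur), but I am having trouble splitting the values. Is there a function in R that can do something like: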
split("A,B,C") ->
1 A
2 B
3 C
Or is there a more direct way to solve this problem?
Best answer
We can use strsplit to split the column on the delimiter ", ", unlist the resulting list, and then use table to get the frequencies:
tbl <- table(unlist(strsplit(as.character(df1$Freetime_activities),
", ")))
as.data.frame(tbl)
# Var1 Freq
#1 Clubbing 4
#2 On the PC 5
#3 Sports 1
#4 Travelling 3
#5 Watching TV 1
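To see what each step contributes, here is a minimal sketch on a toy input shaped like the one in the question. The names x, parts, items and counts are only illustrative and are not part of the original answer.

```r
# Illustrative only: x, parts, items and counts are made-up names
x <- c("A, B, C", "B, C")

# strsplit returns a list with one character vector per input element
parts <- strsplit(x, ", ", fixed = TRUE)

# unlist flattens that list into a single character vector
items <- unlist(parts)

# table counts how often each distinct value occurs
counts <- table(items)
print(counts)
```

Passing fixed = TRUE treats the delimiter as a literal string rather than a regular expression; for ", " both give the same result.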
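Note: as.character is used in the strsplit call in case the column is a factor, because strsplit only accepts a character vector.
Another option is to use scan to extract the elements and then table to get the frequencies:
# scan splits on "," only, so trimws removes the leftover leading spaces
table(trimws(scan(text = as.character(df1$Freetime_activities),
    what = "", sep = ",")))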
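Or use read.table together with unlist and table:
# fill = TRUE pads short rows; na.strings = "" turns the padded blank
# cells into NA so that table() does not count them
table(unlist(read.table(text = as.character(df1$Freetime_activities),
    sep = ",", fill = TRUE, strip.white = TRUE, na.strings = "")))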
Edit: based on @David Arenburg's comment.
Data
df1 <- structure(list(Freetime_activities = c("Travelling, On the PC, Clubbing",
    "Sports, On the PC, Clubbing", "Clubbing", "On the PC",
    "Travelling, On the PC, Clubbing", "On the PC", "Watching TV, Travelling")),
    .Names = "Freetime_activities", class = "data.frame",
    row.names = c("1", "2", "3", "4", "5", "6", "7"))