How to summarize the count of unique values of a categorical variable in R
Problem description
Suppose I have a dataset data:
x1 <- c("a","a","a","a","a","a","b","b","b","b")
x2 <- c("a1","a1","a2","a1","a2","a3","b1","b1","b2","b2")
data <- data.frame(x1,x2)
x1 x2
a a1
a a1
a a2
a a1
a a2
a a3
b b1
b b1
b b2
b b2
I want to find the number of unique values of x2 corresponding to each value of x1. For example, a has only 3 unique values (a1, a2 and a3) and b has 2 values (b1 and b2).
I used aggregate(x1~.,data,sum), but it did not work since these are factors, not integers.
Please help.
Recommended answer
Try
aggregate(x2~x1, data, FUN=function(x) length(unique(x)))
# x1 x2
#1 a 3
#2 b 2
Or
rowSums(table(unique(data)))
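The one-liner above packs three steps into a single call. A minimal base R sketch of those steps follows, assuming the example data from the question with x2 as shown in the printed table; the intermediate names pairs, tab and counts are introduced here only for illustration.

```r
# Rebuild the example data from the question (x2 as in the printed table).
x1 <- c("a","a","a","a","a","a","b","b","b","b")
x2 <- c("a1","a1","a2","a1","a2","a3","b1","b1","b2","b2")
data <- data.frame(x1, x2)

# Step 1: keep one row per distinct (x1, x2) combination.
pairs <- unique(data)

# Step 2: cross-tabulate x1 against x2; after deduplication every
# cell is either 0 or 1.
tab <- table(pairs)

# Step 3: sum each row to count the distinct x2 values per x1 level.
counts <- rowSums(tab)
counts
# a b
# 3 2
```

The result is a named numeric vector rather than a data frame, so this form is convenient for a quick check but less so when the counts need to be joined back to other columns.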
Or
library(dplyr)
data %>%
group_by(x1) %>%
summarise(n=n_distinct(x2))
Or another option using dplyr, suggested by @Eric
count(distinct(data), x1)
Or
library(data.table)
setDT(data)[, uniqueN(x2) , x1]
Update
If you need both the unique values of 'x2' and the count
setDT(data)[, list(n=uniqueN(x2), x2=unique(x2)) , x1]
Or only the unique values
setDT(data)[, list(x2=unique(x2)) , x1]
Or using dplyr
# unique() with no extra arguments drops duplicated (x1, x2) rows
unique(data) %>%
group_by(x1) %>%
mutate(n=n_distinct(x2))
For only the unique values
unique(data)