Problem description
Suppose I have the following data.table:
library(data.table)

dt <- data.table(id = c(rep(1, 5), rep(2, 4)),
                 sex = c(rep("H", 5), rep("F", 4)),
                 fruit = c("apple", "tomato", "apple", "apple", "orange", "apple", "apple", "tomato", "tomato"),
                 key = "id")
   id sex  fruit
1:  1   H  apple
2:  1   H tomato
3:  1   H  apple
4:  1   H  apple
5:  1   H orange
6:  2   F  apple
7:  2   F  apple
8:  2   F tomato
9:  2   F tomato
Each row represents the fact that someone (identified by their id and sex) ate a fruit. I want to count the number of times each fruit has been eaten, broken down by sex. I can do it with:
dt[ , .N, by = c("fruit", "sex")]
This gives:
    fruit sex N
1:  apple   H 3
2: tomato   H 1
3: orange   H 1
4:  apple   F 2
5: tomato   F 2
The problem is that, done this way, I lose the count of orange for sex == "F", because that count is 0. Is there a way to do this aggregation without losing the zero-count combinations?
To be perfectly clear, the desired result would be the following:
    fruit sex N
1:  apple   H 3
2: tomato   H 1
3: orange   H 1
4:  apple   F 2
5: tomato   F 2
6: orange   F 0
Thanks a lot!
Recommended answer
The most straightforward approach seems to be to explicitly supply all category combinations in a data.table passed to i=, setting by=.EACHI to iterate over them:
setkey(dt, sex, fruit)
dt[CJ(sex, fruit, unique = TRUE), .N, by = .EACHI]
#    sex  fruit N
# 1:   F  apple 2
# 2:   F orange 0
# 3:   F tomato 2
# 4:   H  apple 3
# 5:   H orange 1
# 6:   H tomato 1
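If you would rather not re-key dt, the same join can probably be written with on= instead of setkey(). The following is only a sketch of that variant, not part of the original answer: the names combos and counts are made up for illustration, and the combinations are built outside of dt so it is explicit where the column values come from.

```r
library(data.table)

dt <- data.table(
  id    = c(rep(1, 5), rep(2, 4)),
  sex   = c(rep("H", 5), rep("F", 4)),
  fruit = c("apple", "tomato", "apple", "apple", "orange",
            "apple", "apple", "tomato", "tomato")
)

# Hypothetical helper table: every sex/fruit combination present in the data.
# unique = TRUE drops the duplicated input values before the cross join.
combos <- CJ(sex = dt$sex, fruit = dt$fruit, unique = TRUE)

# Join dt to combos on the named columns instead of relying on a key.
# by = .EACHI evaluates .N once per row of combos, so a combination
# with no matching rows in dt gets N = 0.
counts <- dt[combos, .N, by = .EACHI, on = c("sex", "fruit")]
print(counts)
```

Because combos is derived from the values that occur in dt, a category that never appears at all (say, a fruit nobody ate) would still be missing; in that case you would have to pass the full list of levels to CJ() yourself.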