Keeping zero-count combinations when aggregating with data.table


Problem description

Suppose I have the following data.table:

library(data.table)

dt <- data.table(id = c(rep(1, 5), rep(2, 4)),
                 sex = c(rep("H", 5), rep("F", 4)),
                 fruit = c("apple", "tomato", "apple", "apple", "orange", "apple", "apple", "tomato", "tomato"),
                 key = "id")

   id sex  fruit
1:  1   H  apple
2:  1   H tomato
3:  1   H  apple
4:  1   H  apple
5:  1   H orange
6:  2   F  apple
7:  2   F  apple
8:  2   F tomato
9:  2   F tomato

Each row records that someone (identified by their id and sex) ate a fruit. I want to count how many times each fruit has been eaten, by sex. I can do it with:

dt[ , .N, by = c("fruit", "sex")]

This gives:

    fruit sex N
1:  apple   H 3
2: tomato   H 1
3: orange   H 1
4:  apple   F 2
5: tomato   F 2

The problem is that, done this way, I lose the count of orange for sex == "F", because that count is 0. Is there a way to do this aggregation without losing the zero-count combinations?

To be perfectly clear, the desired result would be the following:

    fruit sex N
1:  apple   H 3
2: tomato   H 1
3: orange   H 1
4:  apple   F 2
5: tomato   F 2
6: orange   F 0

Thanks a lot!

Recommended answer

The most straightforward approach seems to be to explicitly supply all category combinations in a data.table passed to i=, and to set by=.EACHI to iterate over them. Combinations with no matching rows in dt are then reported with N = 0:

setkey(dt, sex, fruit)
dt[CJ(sex, fruit, unique = TRUE), .N, by = .EACHI]
#    sex  fruit N
# 1:   F  apple 2
# 2:   F orange 0
# 3:   F tomato 2
# 4:   H  apple 3
# 5:   H orange 1
# 6:   H tomato 1
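
The same join can also be written without changing the key of dt, by naming the join columns with on=. The following is only a sketch of that variant, not part of the original answer; the names combos and res are introduced here for illustration.

```r
library(data.table)

dt <- data.table(
  id    = c(rep(1, 5), rep(2, 4)),
  sex   = c(rep("H", 5), rep("F", 4)),
  fruit = c("apple", "tomato", "apple", "apple", "orange",
            "apple", "apple", "tomato", "tomato")
)

# Every sex/fruit pair, including pairs that never occur in dt
# (combos is a hypothetical helper name)
combos <- dt[, CJ(sex = sex, fruit = fruit, unique = TRUE)]

# Join dt to the full grid; by = .EACHI evaluates .N once per row of combos,
# so a pair with no matching rows in dt gets N = 0
res <- dt[combos, .N, on = .(sex, fruit), by = .EACHI]
print(res)
```

Building combos as a separate table makes it easy to inspect the grid, or to add categories that do not appear in the data at all, before counting.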


07-31 03:40