Problem description
I'd like to use data.table to do some wrangling, and I'd like the resulting data table not to include the grouping variable.
Here is an MWE:
library("data.table")
DT <- data.table(x = 1:10, grp = rep(1:2,5))
DT[, .(mmm = mean(x)), by = grp]
This produces:
   grp mmm
1:   1   5
2:   2   6
That is all fine. However, I'd prefer grp not to be there. This can be fixed by chaining the data.table calls and setting grp := NULL, or by just throwing the variable away afterwards, but can I prevent it in the first call so that only mmm is returned?
Recommended answer
It isn't clear why you don't want to use this. Using DT[, .(mmm = mean(x)), by = grp][, grp := NULL][] would be my first choice.
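As a minimal sketch of that chained call, using the DT from the question: the first bracket aggregates by grp, the second removes the grouping column by reference, and the trailing [] makes the result print, because := returns its value invisibly. The name res is only there for illustration.

```r
library(data.table)

DT <- data.table(x = 1:10, grp = rep(1:2, 5))

# Aggregate by group, then drop the grouping column from the aggregated table.
# := modifies the intermediate result by reference; the original DT is untouched.
res <- DT[, .(mmm = mean(x)), by = grp][, grp := NULL][]

res          # a data.table with the single column mmm
names(DT)    # DT still contains both x and grp
```

Since := only acts on the intermediate result of the grouped call, the source table keeps its grp column.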
Although I wouldn't advise it, you can also use:
DT[, .(mmm = DT[, .(mmm = mean(x)), by = grp]$mmm)]
which will give you the desired result as well:
   mmm
1:   5
2:   6
Although you get the same result, it is better not to use this method. The major drawback is that your code becomes unnecessarily complicated when you want to summarise more than one value column. You would then get something like:
DT[, .(mx = DT[, .(mx = mean(x)), by = grp]$mx, my = DT[, .(my = mean(y)), by = grp]$my)]
whereas the regular data.table way would be:
DT[, .(mx = mean(x), my = mean(y)), by = grp][, grp := NULL][]
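The DT from the question has no y column, so the two calls above cannot be run against it as written. A small sketch for comparing them, assuming a hypothetical second value column y added purely for illustration (the names nested and chained are likewise made up here):

```r
library(data.table)

# y is a hypothetical second value column, not part of the original question
DT <- data.table(x = 1:10, y = 11:20, grp = rep(1:2, 5))

# Nested version: one inner grouped call per summarised column
nested <- DT[, .(mx = DT[, .(mx = mean(x)), by = grp]$mx,
                 my = DT[, .(my = mean(y)), by = grp]$my)]

# Chained version: a single grouped call, then drop grp by reference
chained <- DT[, .(mx = mean(x), my = mean(y)), by = grp][, grp := NULL][]

# Compare the two results
all.equal(nested, chained)
```

The nested form runs a separate grouped aggregation for every column, while the chained form groups once, which is why it stays readable as more columns are added.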
In summary:
Using the DT[, .(mmm = mean(x)), by = grp][, grp := NULL][] method would thus be your best choice.