Problem description
I'm trying to aggregate data from a data.table to create a new column which is a list of previous rows. It's easier to see by example:
dt <- data.table(id = c(1,1,1,1,2,2,3,3,3), letter = c('a','a','b','c','a','c','b','b','a'))
I would like to aggregate this in such a way that the result is
id letter
1: 1 a,a,b,c
2: 2 a,c
3: 3 b,b,a
Intuitively, I tried
dt[,j = list(list(letter)), by = id]
but that doesn't work. Oddly enough, when I go case by case, for example:
> dt[id == 1,j = list(list(letter)), by = id]
id V1
1: 1 a,a,b,c
the result is fine... I feel like I'm missing an .SD somewhere or something like that...
Can anybody point me in the right direction?
Thanks!
Recommended answer
Update: The behaviour of DT[, list(list(.)), by=.] sometimes produced wrong results in R versions >= 3.1.0. This is now fixed in commit #1280 in the current development version of data.table, v1.9.3 (https://github.com/Rdatatable/data.table). From NEWS:
DT[, list(list(.)), by=.] returns correct results in R >= 3.1.0 as well. The bug was due to recent (welcome) changes in R v3.1.0, where list(.) does not result in a copy. Closes #481.
With this update, I() is no longer necessary. You can just do DT[, list(list(.)), by=.] as before.
This seems to be similar to the known bug #5585. In your case, I think you could just use
dt[, paste(letter, collapse=","), by = id]
to solve your problem.
As @ilir pointed out, if you actually want a list (rather than the character string that is displayed), you could use the workaround suggested in the bug report:
dt[, list(list(I(letter))), by = id]
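To make the difference between the two results concrete, here is a minimal sketch that builds both the pasted string column and a true list column from the question's data. It assumes a data.table version that already includes the fix described in the update, so I() is left out; the names as_string and as_list are only illustrative.

```r
library(data.table)

dt <- data.table(id = c(1,1,1,1,2,2,3,3,3),
                 letter = c('a','a','b','c','a','c','b','b','a'))

# Character column: one comma-separated string per id
as_string <- dt[, .(letter = paste(letter, collapse = ",")), by = id]

# List column: one character vector per id
# (assumes a data.table version containing the fix; otherwise wrap in I())
as_list <- dt[, .(letter = list(letter)), by = id]

# The printed tables look alike, but the column types differ
class(as_string$letter)   # "character"
class(as_list$letter)     # "list"

# Each list element still holds the individual values for that id
as_list$letter[[1]]       # "a" "a" "b" "c"
lengths(as_list$letter)   # 4 2 3
```

The list column is the better fit if you need to work with the individual values later; the pasted string is enough if the result is only displayed.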