
I'm trying to aggregate data from a data.table to create a new column that is a list of the values from the grouped rows. It's easier to see by example:

library(data.table)
dt <- data.table(id = c(1,1,1,1,2,2,3,3,3), letter = c('a','a','b','c','a','c','b','b','a'))


I would like to aggregate this in such a way that the result is

   id  letter
1:  1 a,a,b,c
2:  2     a,c
3:  3   b,b,a


I tried:

dt[, j = list(list(letter)), by = id]


but that doesn't work. Oddly enough, when I go case by case, for example:

> dt[id == 1,j = list(list(letter)), by = id]

   id      V1
1:  1 a,a,b,c


the result is fine. I feel like I'm missing an .SD somewhere, or something like that.


Can anybody point me in the right direction?




Update: The behaviour DT[, list(list(.)), by=.] sometimes resulted in wrong results in R version >= 3.1.0. This is now fixed in commit #1280 in the current development version of data.table v1.9.3. From NEWS:

  • DT[, list(list(.)), by=.] returns correct results in R >= 3.1.0 as well. The bug was due to recent (welcome) changes in R v3.1.0, where list(.) does not result in a copy. Closes #481.


With this update, I() is no longer necessary. You can just use DT[, list(list(.)), by=.] as before.


This seems to be similar to the known bug #5585. In your case, I think you could just use

dt[, paste(letter, collapse=","), by = id]
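Here is a minimal sketch of what this returns, using the dt from the question. It shows that paste() gives one character string per group, which prints like the desired output but is not a list. The column name letters_chr is introduced here for illustration only.

```r
library(data.table)

dt <- data.table(id = c(1, 1, 1, 1, 2, 2, 3, 3, 3),
                 letter = c('a', 'a', 'b', 'c', 'a', 'c', 'b', 'b', 'a'))

# Collapse each group's letters into a single comma-separated string
res_chr <- dt[, list(letters_chr = paste(letter, collapse = ",")), by = id]

# The new column is a plain character vector, one string per id
class(res_chr$letters_chr)
res_chr$letters_chr

# If the individual values are needed later, the strings can be split again
split_back <- strsplit(res_chr$letters_chr, ",", fixed = TRUE)
```

This is enough when the result is only displayed or exported. If later code needs to work on the individual values, a real list column avoids the extra strsplit() step.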



As @ilir pointed out, if you actually want a list column (rather than the character string displayed above), you could use the workaround suggested in the bug report:

dt[, list(list(I(letter))), by = id]
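As a rough check of the workaround, the sketch below builds the list column on the dt from the question and inspects its contents. The column name letters_list and the helper variable plain are introduced here for illustration. The unclass() step is only there so that any "AsIs" class added by I() does not get in the way of later comparisons.

```r
library(data.table)

dt <- data.table(id = c(1, 1, 1, 1, 2, 2, 3, 3, 3),
                 letter = c('a', 'a', 'b', 'c', 'a', 'c', 'b', 'b', 'a'))

# Workaround from the bug report: wrap the column in I() inside the nested list
res_list <- dt[, list(letters_list = list(I(letter))), by = id]

# The result has one row per id, and the new column is a list
is.list(res_list$letters_list)

# Strip any "AsIs" class so each element is a plain character vector
plain <- lapply(res_list$letters_list, unclass)

# Number of letters collected for each id
sapply(plain, length)
```

On a data.table version that includes the fix described in the update, the same check should pass with list(list(letter)) in place of list(list(I(letter))).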


07-25 02:59