This article describes how data.table did not handle integer64 in by statements, and how that was resolved.

Problem description

Using fread from data.table loads integer64 correctly, but I have the impression that by statements do not handle integer64 correctly. I am probably doing something wrong here; what is it?

library(data.table); library(bit64);
test = data.table(x=c(1,2,3),y=c('x','q','q'),ID=as.integer64(c('432706205348805058','432706205348805058','432706205348805059')))

str(test) #the display is wrong (BUT IT IS EXPECTED)
#Classes ‘data.table’ and 'data.frame':  3 obs. of  3 variables:
# $ x : num  1 2 3
# $ y : chr  "x" "q" "q"
# $ ID:Class 'integer64'  num [1:3] 9.52e-280 9.52e-280 9.52e-280
# - attr(*, ".internal.selfref")=<externalptr> 

test # Here it is displayed correctly
#   x y                 ID
#1: 1 x 432706205348805058
#2: 2 q 432706205348805058
#3: 3 q 432706205348805059

test$ID
#integer64
#[1] 432706205348805058 432706205348805058 432706205348805059

test[,list(count=.N),by=ID] #WRONG: both IDs are collapsed into one group
#                   ID count
#1: 432706205348805058     3
Solution

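Update: This is now implemented in v1.9.3 (available from R-Forge); see NEWS:

o bit64::integer64 now works in grouping and joins, #5369. Thanks to James Sams for highlighting UPCs and Clayton Stanley.
Reminder: fread() has been able to detect and read integer64 for a while.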

On OP's example above:

test[, .N, by=ID]
#                    ID N
# 1: 432706205348805058 2
# 2: 432706205348805059 1
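
A minimal sketch for checking this yourself, assuming a data.table release with integer64 support in grouping and joins (v1.9.3 or later) and, for the join, a release recent enough to accept the on= argument. The labels table and its label column are hypothetical and exist only for illustration.

```r
library(data.table)
library(bit64)

# Same data as in the question.
ids <- as.integer64(c("432706205348805058",
                      "432706205348805058",
                      "432706205348805059"))
test <- data.table(x = c(1, 2, 3), y = c("x", "q", "q"), ID = ids)

# Grouping: there are two distinct IDs, so two groups are expected.
counts <- test[, .N, by = ID]

# Join: attach a label to every row of test via the integer64 column.
# 'labels' is a hypothetical lookup table, not part of the original post.
labels <- data.table(
  ID    = as.integer64(c("432706205348805058", "432706205348805059")),
  label = c("first", "second")
)
joined <- labels[test, on = "ID"]

print(counts)
print(joined)
```

If the grouped result still shows a single group, the installed data.table most likely predates the integer64 support described above.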


Before v1.9.3: integer64 wasn't yet implemented for data.table operations such as setkey or by. It was implemented in fread only (first released to CRAN on 6 March 2013) as a first step. It could still be useful as a value column, for example.
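
On a version without integer64 grouping, one possible workaround is to group on a character copy of the column. This is an assumption rather than something the answer proposes; it relies only on bit64's as.character method for integer64, and the column name ID_chr is made up for this sketch.

```r
library(data.table)
library(bit64)

test <- data.table(
  x  = c(1, 2, 3),
  y  = c("x", "q", "q"),
  ID = as.integer64(c("432706205348805058",
                      "432706205348805058",
                      "432706205348805059"))
)

# Group on the decimal string form of ID instead of the integer64 itself.
# ID_chr is a hypothetical name; the original ID column is left untouched.
counts_chr <- test[, .N, by = list(ID_chr = as.character(ID))]
print(counts_chr)
```

The grouped key comes back as character, so convert it with as.integer64() if a numeric ID is needed afterwards.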

I may have confused matters by filing a bug report relating to this (the one @Arun linked to). Strictly speaking, it isn't a bug but a feature request. I think of the bug list more like 'important things to resolve before the next release'.

Contributions are very welcome.
