data.table is a great package; unfortunately, it generates unnecessary warnings from checkUsage (the code comes from here and here):

> library(data.table)
> library(codetools)
> library(compiler)
> compiler::enableJIT(3)
> dt <- data.table(a = c(rep(3, 5), rep(4, 5)), b=1:10, c=11:20, d=21:30, key="a")
> my.func <- function (dt) {
  dt.out <- dt[, lapply(.SD, sum), by = a]
  dt.out[, count := dt[, .N, by=a]$N]
  dt.out
}
> checkUsage(my.func)
<anonymous>: no visible binding for global variable ‘.SD’ (:2)
<anonymous>: no visible binding for global variable ‘a’ (:2)
<anonymous>: no visible binding for global variable ‘count’ (:3)
<anonymous>: no visible binding for global variable ‘.N’ (:3)
<anonymous>: no visible binding for global variable ‘a’ (:3)
> my.func(dt)
Note: no visible binding for global variable '.SD'
Note: no visible binding for global variable 'a'
Note: no visible binding for global variable 'count'
Note: no visible binding for global variable '.N'
Note: no visible binding for global variable 'a'
   a  b  c   d count
1: 3 15 65 115     5
2: 4 40 90 140     5

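The warnings about a can be avoided by replacing by=a with by="a", but how do I deal with the other three warnings?

This matters to me because these warnings clutter the screen and obscure legitimate warnings. Since the warnings are issued when my.func is called (with the JIT compiler enabled), and not only by checkUsage, I am inclined to call this a bug.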

Best Answer

Update: this is now resolved in v1.8.11, as noted in the package NEWS.

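To resolve the notes for the column name symbols count and a, both can be quoted (even on the LHS of :=). Using a fresh R session (since the notes appear only the first time), the following now produces no notes.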

$ R
R version 3.0.1 (2013-05-16) -- "Good Sport"
Copyright (C) 2013 The R Foundation for Statistical Computing
Platform: x86_64-pc-linux-gnu (64-bit)
> require(data.table)
Loading required package: data.table
data.table 1.8.11  For help type: help("data.table")
> library(codetools)
> library(compiler)
> compiler::enableJIT(3)
[1] 0
> dt <- data.table(a=c(rep(3,5),rep(4,5)), b=1:10, c=11:20, d=21:30, key="a")
> my.func <- function (dt) {
  dt.out <- dt[, lapply(.SD, sum), by = "a"]
  dt.out[, "count" := dt[, .N, by="a"]$N]
  dt.out
}
> my.func(dt)
   a  b  c   d count
1: 3 15 65 115     5
2: 4 40 90 140     5
> checkUsage(my.func)
>
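
As a rough, self-contained way to compare the two styles, the sketch below collects the checkUsage messages through its report argument instead of printing them. The helper collect_notes and the function names unquoted and quoted are made up for illustration, and the exact messages depend on the installed data.table and codetools versions (it assumes a data.table release that includes the fix described above).

```r
library(data.table)
library(codetools)

dt <- data.table(a = c(rep(3, 5), rep(4, 5)), b = 1:10, c = 11:20, d = 21:30, key = "a")

# Column names written as bare symbols: checkUsage cannot see where they come from.
unquoted <- function(dt) {
  dt.out <- dt[, lapply(.SD, sum), by = a]
  dt.out[, count := dt[, .N, by = a]$N]
  dt.out
}

# Same logic with the column names quoted, as suggested in the answer.
quoted <- function(dt) {
  dt.out <- dt[, lapply(.SD, sum), by = "a"]
  dt.out[, "count" := dt[, .N, by = "a"]$N]
  dt.out
}

# Hypothetical helper: gather checkUsage messages into a character vector.
collect_notes <- function(f) {
  notes <- character()
  checkUsage(f, report = function(msg) notes <<- c(notes, msg))
  notes
}

notes_unquoted <- collect_notes(unquoted)
notes_quoted <- collect_notes(quoted)

print(notes_unquoted)
print(notes_quoted)
```

Collecting the messages this way makes it possible to check in a script that the quoted version stays free of notes, rather than reading the console output by eye.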

Source: "r - data.table does not play well with checkUsage" on Stack Overflow: https://stackoverflow.com/questions/16172216/