R data.table: compute a new column, but insert it at the beginning

Problem description

In R data.tables, I can use this syntax to add a new column:

> dt <- data.table(a=c(1,2), b=c(3,4))
> dt[, c := a + b]
> dt
   a b c
1: 1 3 4
2: 2 4 6

But how would I insert c at the front of the dt like so:

   c a b
1: 4 1 3
2: 6 2 4

I looked on SO, and found some people suggesting cbind for data.frames, but it's more convenient for me to use the := syntax here, so I was wondering if there was a data.table sanctioned way of doing this. My data.table has around 100 columns, so I don't want to list them all out.

Recommended answer

  1. setcolorder() now accepts fewer than ncol(DT) columns to be moved to the front, #592. Thanks to @MichaelChirico for the PR.

Current development version of data.table (v1.10.5) has updates to setcolorder() that make this way more convenient by accepting a partial list of columns. The columns provided are placed first, and then all non-specified columns are added after in the existing order.

Installation instructions for the development branch are here.

Note regarding development branch stability: I've been running it for several months now to utilize the multi-threaded version in fread() in v1.10.5 (that alone is worth the update if you deal with multi-GB .csv files) and I have not noticed any bugs or regressions for my usage.

library(data.table)
DT <- as.data.table(mtcars)
DT[1:5]

gives

    mpg cyl disp  hp drat    wt  qsec vs am gear carb
1: 21.0   6  160 110 3.90 2.620 16.46  0  1    4    4
2: 21.0   6  160 110 3.90 2.875 17.02  0  1    4    4
3: 22.8   4  108  93 3.85 2.320 18.61  1  1    4    1
4: 21.4   6  258 110 3.08 3.215 19.44  1  0    3    1
5: 18.7   8  360 175 3.15 3.440 17.02  0  0    3    2

Re-order columns based on a partial list:

setcolorder(DT,c("gear","carb"))
DT[1:5]

now gives

   gear carb  mpg cyl disp  hp drat    wt  qsec vs am
1:    4    4 21.0   6  160 110 3.90 2.620 16.46  0  1
2:    4    4 21.0   6  160 110 3.90 2.875 17.02  0  1
3:    4    1 22.8   4  108  93 3.85 2.320 18.61  1  1
4:    3    1 21.4   6  258 110 3.08 3.215 19.44  1  0
5:    3    2 18.7   8  360 175 3.15 3.440 17.02  0  0
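
Applied to the example from the question, the same idea might look like the sketch below. It assumes a data.table version whose setcolorder() accepts a partial list of columns, as described above; the object and column names are taken from the question.

```r
library(data.table)

dt <- data.table(a = c(1, 2), b = c(3, 4))

# Add the new column by reference; := appends it as the last column
dt[, c := a + b]

# Move "c" to the front; the unlisted columns (a, b) keep their existing order
# (assumes a setcolorder() that accepts a partial column list)
setcolorder(dt, "c")

names(dt)
```

Both steps modify dt by reference, so nothing has to be reassigned and none of the other columns need to be listed.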


If for any reason you don't want to update to the development branch, the following works in previous (and current CRAN) versions.

newCols <- c("gear","carb")
## columns to move go first, followed by every remaining column in its existing order
setcolorder(DT, c(newCols, setdiff(colnames(DT), newCols))) ## (Per Frank's advice in comments)

## the long way I'd always done before seeing setdiff()
## setcolorder(DT,c(newCols,colnames(DT)[which(!colnames(DT) %in% newCols)]))
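
For the case in the question (one computed column, many existing columns), the setdiff() approach could be wrapped in a small helper. This is only a sketch: move_to_front() is a hypothetical name, not a data.table function, and it relies on setcolorder() reordering by reference.

```r
library(data.table)

# Hypothetical helper (not part of data.table): put `cols` first and keep
# all remaining columns in their current order
move_to_front <- function(DT, cols) {
  setcolorder(DT, c(cols, setdiff(names(DT), cols)))
  invisible(DT)
}

dt <- data.table(a = c(1, 2), b = c(3, 4))
dt[, c := a + b]        # new column is appended at the end
move_to_front(dt, "c")  # reorders dt in place, no reassignment needed

names(dt)
```

Because the full column order is always passed to setcolorder(), this form does not depend on partial-list support.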
