This article covers how to use a constant value in the join of a data.table update.

Problem description

I'd like to update a subset of rows in a data.table based on a join and a fixed value.

library(data.table)

d1 <- data.table(A = 5:1, B = letters[5:1])
d2 <- data.table(C = letters[5:1], Z = 6:10)
current.val <- 5

What I want to do is update d1 based on the join with d2, but only where A == 5 in d1. Something like this:

d1[d2, D := i.Z ,on=.(B==C, A==current.val)]

My current approach is to add a new column to d2, set it to the fixed value, and use that column in the join:

d2[, current.val := 5]
d1[d2, D := i.Z ,on=.(B==C, A==current.val)]

This works, but it seems like a lot of overhead. Is there a simpler way to use a constant value in a join?

(8/14) New, larger-scale example for benchmarking:

d1 <- data.table(A = 100:1, B = 100000000:1, D = as.numeric(NA),  key = c("A", "B"))
d2 <- data.table(C = 100000000:1, Z = c(10:1) / 10, key = "C")
current.val <- 5

system.time(d1[cbind(d2, A = current.val), on = .(B = C, A), D := i.Z])
system.time({setkey(d1, B, A); d1[d1[d2][A == current.val], D := Z]; setkey(d1, A, B)})
system.time(d1[d1[d2][A == current.val], D := Z]) # fastest, if inverse key order is acceptable

Recommended answer

That's a good way to go. Alternatively, you could add a column temporarily inside the join with cbind:

d1[cbind(d2, A = current.val), on=.(B = C, A), D := i.Z ]
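To make the effect concrete, here is a minimal sketch that runs the cbind version on the small example data from the question. It assumes the data.table package is installed; the print calls are only there for inspection and are not part of the original answer.

```r
library(data.table)

# Small example data from the question
d1 <- data.table(A = 5:1, B = letters[5:1])
d2 <- data.table(C = letters[5:1], Z = 6:10)
current.val <- 5

# cbind() builds a temporary table (C, Z, A) with the constant in column A;
# d2 itself is not modified, so there is nothing to clean up afterwards
d1[cbind(d2, A = current.val), on = .(B = C, A), D := i.Z]

print(d1)         # only the row with A == 5 (B == "e") gets a value in D
print(names(d2))  # d2 still has just its original columns
```

Because the constant lives only in the temporary table, this avoids leaving a helper column behind in d2, which is the main difference from the approach in the question.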

Actually, c works in place of cbind here, but I find it a weirder approach.
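As a rough check of that claim, the sketch below runs both variants on the small example data and compares the results. The table names via_cbind and via_c are made up for this illustration. My understanding is that c() on a data.table returns a plain list, which data.table then treats as the table to join with, but take that explanation as an assumption rather than documented behaviour.

```r
library(data.table)

d2 <- data.table(C = letters[5:1], Z = 6:10)
current.val <- 5

# Two identical copies of d1, one per variant (names are illustrative only)
via_cbind <- data.table(A = 5:1, B = letters[5:1])
via_c     <- data.table(A = 5:1, B = letters[5:1])

# Variant from the answer: temporary table built with cbind()
via_cbind[cbind(d2, A = current.val), on = .(B = C, A), D := i.Z]

# Same join, but i is the list returned by c(); the length-1 element A
# is assumed to be recycled when the list is converted for the join
via_c[c(d2, A = current.val), on = .(B = C, A), D := i.Z]

print(all.equal(via_cbind, via_c))
```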
