Suppose I have the following two data.tables:
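library(data.table)

## Long-format table: three time points per id
mult.year <- data.table(id=c(1,1,1,2,2,2,3,3,3),
                        time=rep(1:3, 3),
                        A=rnorm(9),
                        B=rnorm(9))
setkey(mult.year, id)
## Single-time-point table: one row per id
single <- data.table(id=c(1,2,3),
                     C.3=rnorm(3))
setkey(single, id)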
I want to join the two tables so that the variable C.3 is filled in only for the rows in mult.year[time == 3].
I can do this by assigning a new column:
mult.year[time == 3, C := single[,C.3]]
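But then I lose the join functionality: the assignment requires every id to be present in both datasets. Is there a way to do this while keeping the join functionality? Using the tables above, I am trying to get this: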
id time A B C.3
1: 1 1 -1.0460085 0.0896452 NA
2: 1 2 0.2054772 1.5631978 NA
3: 1 3 -1.7574449 0.5661457 0.6495645
4: 2 1 0.4171095 -0.2182779 NA
5: 2 2 -0.9238671 0.8263605 NA
6: 2 3 -0.5452715 -0.5842541 -1.5233764
7: 3 1 0.1793009 1.4399366 NA
8: 3 2 0.3438980 1.7419869 NA
9: 3 3 0.1067989 0.7630496 1.9658157
Best answer
If you are willing to include time in the data.table's key, you can do the following:
## Add time ...
setkeyv(mult.year, c("id", "time")) ## ... to mult.year's key
single <- data.table(id=c(1,2,3), time=3, C.3=rnorm(3)) ## ... and to indexing dt
## Which will set up a simple call to [.data.table
mult.year[single, C.3:=C.3]
mult.year
# id time A B C.3
# 1: 1 1 -0.6264538 -0.30538839 NA
# 2: 1 2 0.1836433 1.51178117 NA
# 3: 1 3 -0.8356286 0.38984324 0.61982575
# 4: 2 1 1.5952808 -0.62124058 NA
# 5: 2 2 0.3295078 -2.21469989 NA
# 6: 2 3 -0.8204684 1.12493092 -0.05612874
# 7: 3 1 0.4874291 -0.04493361 NA
# 8: 3 2 0.7383247 -0.01619026 NA
# 9: 3 3 0.5757814 0.94383621 -0.15579551
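The question asked specifically about ids that are not present in both tables, so a quick check of that case may be useful. The following is a minimal sketch, not part of the original answer: `single.partial` is a hypothetical variant of `single` with id 2 left out, the seed is fixed only for reproducibility, and `i.C.3` is used instead of a bare `C.3` to state explicitly that the value comes from the table passed as `i`.

```r
library(data.table)

set.seed(1)  # assumption: fixed seed, only so the sketch is reproducible
mult.year <- data.table(id   = c(1,1,1,2,2,2,3,3,3),
                        time = rep(1:3, 3),
                        A    = rnorm(9),
                        B    = rnorm(9))
setkeyv(mult.year, c("id", "time"))

## Hypothetical lookup table with id 2 deliberately missing
single.partial <- data.table(id = c(1, 3), time = 3L, C.3 = rnorm(2))

## Join on the (id, time) key and update by reference;
## rows of mult.year with no match in single.partial keep NA
mult.year[single.partial, C.3 := i.C.3]
mult.year[time == 3]
```

Under these assumptions the row for id 2 at time 3 should be left as NA rather than raising an error, which is the join behaviour the question wanted to keep.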
Alternatively, to leave single and the current key unchanged, use the approach suggested in mnel's comment above:
mult.year[single, C.3 := ifelse(time==3,C.3,NA)]
mult.year
# id time A B C.3
# 1: 1 1 -0.6264538 -0.30538839 NA
# 2: 1 2 0.1836433 1.51178117 NA
# 3: 1 3 -0.8356286 0.38984324 0.8212212
# 4: 2 1 1.5952808 -0.62124058 NA
# 5: 2 2 0.3295078 -2.21469989 NA
# 6: 2 3 -0.8204684 1.12493092 0.5939013
# 7: 3 1 0.4874291 -0.04493361 NA
# 8: 3 2 0.7383247 -0.01619026 NA
# 9: 3 3 0.5757814 0.94383621 0.9189774
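In more recent versions of data.table, the `on=` argument offers another way to leave both `single` and the existing key untouched. The sketch below is an assumption-laden variant rather than the method from the answer: it assumes a data.table version that supports `on=`, builds the `time` column on the fly inside `i`, and uses the `i.` prefix to refer to the lookup table's column.

```r
library(data.table)

set.seed(1)  # assumption: fixed seed, only so the sketch is reproducible
mult.year <- data.table(id   = c(1,1,1,2,2,2,3,3,3),
                        time = rep(1:3, 3),
                        A    = rnorm(9),
                        B    = rnorm(9))
setkey(mult.year, id)
single <- data.table(id = c(1, 2, 3), C.3 = rnorm(3))
setkey(single, id)

## Build a temporary lookup with time = 3; `single` itself is not modified
mult.year[single[, .(id, time = 3L, C.3)],
          on = .(id, time),
          C.3 := i.C.3]

key(mult.year)  # the key on id is still in place
mult.year
```

Because the join columns are named in `on=`, neither table needs to be re-keyed, and rows at other time points are simply never matched, so no `ifelse()` is required.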
Regarding r - data.table: joining a "long format" multiple-time-point table with a single-time-point table, a similar question can be found on Stack Overflow: https://stackoverflow.com/questions/22291392/