我有这种格式的数据

> data = data.table(id = 1:10, date = seq(as.Date("2016-01-01"), by = 1, length = 10))
> data
    id       date
 1:  1 2016-01-01
 2:  2 2016-01-02
 3:  3 2016-01-03
 4:  4 2016-01-04
 5:  5 2016-01-05
 6:  6 2016-01-06
 7:  7 2016-01-07
 8:  8 2016-01-08
 9:  9 2016-01-09
10: 10 2016-01-10

我有另一个矩阵,即我希望执行的查询/子集。
> query = data.table(id = c(1,4,7), date_start = c("2016-01-01", "2016-01-01", "2016-01-01"), date_end = c("2016-01-04", "2016-01-02", "2016-01-03"))
> query
   id date_start   date_end
1:  1 2016-01-01 2016-01-04
2:  4 2016-01-01 2016-01-02
3:  7 2016-01-01 2016-01-03

我希望做这样的事情:
subset(data, (id == query[1] & date > date_start[1] & date < date_end[1]) |
       (id == query[2] & date > date_start[2] & date < date_end[2]) |
       (id == query[3] & date > date_start[3] & date < date_end[3]))

是否有一个自动生成的子集查询,而无需使用for循环和重新绑定(bind)结果。

谢谢

最佳答案

如果我们稍微转换OP的数据以获得

library(data.table)
data = setDT(structure(list(id = 1:10, date = structure(16801:16810, class = c("IDate",
"Date")), date2 = structure(16801:16810, class = c("IDate", "Date"
))), .Names = c("id", "date", "date2"), row.names = c(NA, -10L
), class = c("data.table", "data.frame"), sorted = c("id",
"date", "date2")))

query = setDT(structure(list(id = c(1, 4, 7), date_start =
structure(c(16801L,
16801L, 16801L), class = c("IDate", "Date")), date_end = structure(c(16804L,
16802L, 16803L), class = c("IDate", "Date"))), .Names = c("id",
"date_start", "date_end"), row.names = c(NA, -3L), class = c("data.table",
"data.frame"), sorted = c("id",
"date_start", "date_end")))

...那么我们可以像这样使用foverlaps
foverlaps(data, query, nomatch=0)
#    id date_start   date_end       date      date2
# 1:  1 2016-01-01 2016-01-04 2016-01-01 2016-01-01

对于这种方法,我认为在合并之前需要采取以下步骤:
  • 将所有日期作为IDate
  • 在主数据
  • 中创建一个额外的日期列
  • 在每个表上设置键
  • 09-25 18:25
    查看更多