Merging tables by time range/interval with lubridate

Problem description

I am trying to merge two tables based on time ranges. I only found some old answers on this (e.g. Data Table merge based on date ranges) which don't use lubridate.

Actually, lubridate provides the %within% operator, which checks whether a date falls within an interval. I constructed a minimal example and am wondering whether there is a way to merge these data frames based on the overlapping dates/intervals, i.e. by checking whether df1$Date falls within df2$interval.

library(lubridate)
df1 <- data.frame(Date=c(ymd('20161222'),ymd('20161223'),ymd('20161228'),ymd('20170322')),
                  User=c('a','b','a','a'),
                  Units=c(1,2,3,1))
df2 <- data.frame(User=c('a','b','a'),
                  Start=c(ymd('20140101'), ymd('20140101'), ymd('20170101')),
                  End=c(ymd('20161231'),ymd('20170331'),ymd('20170331')),
                  Price=c(10,10,20))
df2$interval <- interval(df2$Start, df2$End)
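For reference, here is a minimal sketch of how %within% behaves on a single interval. The name iv is only illustrative and is not part of the example above. Both end points count as inside the interval.

```r
library(lubridate)

# iv is an illustrative interval covering 2014-01-01 through 2016-12-31
iv <- interval(ymd("20140101"), ymd("20161231"))

inside   <- ymd("20161222") %within% iv  # TRUE: the date lies inside the interval
outside  <- ymd("20170322") %within% iv  # FALSE: the date lies after the end
boundary <- ymd("20161231") %within% iv  # TRUE: the end point is included
```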

My expected output would look like this:

|   |User |Date       | Units| Price|
|:--|:----|:----------|-----:|-----:|
|1  |a    |2016-12-22 |     1|    10|
|3  |a    |2016-12-28 |     3|    10|
|6  |a    |2017-03-22 |     1|    20|
|7  |b    |2016-12-23 |     2|    10|

Recommended answer

This may be inefficient for large dataframes (since you're creating a much larger match and subsetting), and I'm sure there's a more elegant way, but this works:

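test <- merge(df1, df2, by="User")                  # all df1/df2 row pairs that share a User
output <- test[test$Date %within% test$interval, ]  # keep pairs whose Date falls in the interval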
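If the intermediate table created by merge() becomes too large, one possible alternative is a non-equi join. The sketch below is not part of the original answer. It assumes the data.table package is installed, rebuilds the example tables as data.tables under the illustrative names dt1 and dt2, and compares plain as.Date() dates against Start and End instead of using a lubridate interval.

```r
library(data.table)

# Same example data as in the question, rebuilt as data.tables
# (dt1 and dt2 are illustrative names)
dt1 <- data.table(Date  = as.Date(c("2016-12-22", "2016-12-23",
                                    "2016-12-28", "2017-03-22")),
                  User  = c("a", "b", "a", "a"),
                  Units = c(1, 2, 3, 1))
dt2 <- data.table(User  = c("a", "b", "a"),
                  Start = as.Date(c("2014-01-01", "2014-01-01", "2017-01-01")),
                  End   = as.Date(c("2016-12-31", "2017-03-31", "2017-03-31")),
                  Price = c(10, 10, 20))

# Update join: for each dt2 row, find the dt1 rows with the same User and
# Start <= Date <= End, and copy that row's Price into dt1 by reference.
dt1[dt2, on = .(User, Date >= Start, Date <= End), Price := i.Price]
```

Rows of dt1 that fall into no interval keep NA in Price, and no table of all per-user pairs is materialised first.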

Or you could use a loop:

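df1$Price <- NA_real_  # create the result column up front
for (x in seq_len(nrow(df1))) {
  # df2 rows for the same user whose interval contains this date (assumes exactly one match)
  hit <- (df1$Date[x] %within% df2$interval) & df1$User[x] == df2$User
  df1$Price[x] <- df2$Price[hit]
}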

I'm sure you could also make a function and use apply...
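A minimal sketch of that idea, using vapply() from the apply family. The helper name lookup_price is hypothetical, and the choice to return NA when no interval matches (and to take the first match when several do) is an assumption rather than something the original answer specifies.

```r
library(lubridate)

# Example data from the question, repeated so the block runs on its own
df1 <- data.frame(Date  = ymd(c("20161222", "20161223", "20161228", "20170322")),
                  User  = c("a", "b", "a", "a"),
                  Units = c(1, 2, 3, 1))
df2 <- data.frame(User  = c("a", "b", "a"),
                  Start = ymd(c("20140101", "20140101", "20170101")),
                  End   = ymd(c("20161231", "20170331", "20170331")),
                  Price = c(10, 10, 20))
df2$interval <- interval(df2$Start, df2$End)

# lookup_price is a hypothetical helper: for row i of df1, return the Price
# of the first df2 row with the same User whose interval contains the Date,
# or NA if there is none.
lookup_price <- function(i) {
  hit <- (df2$User == df1$User[i]) & (df1$Date[i] %within% df2$interval)
  if (any(hit)) df2$Price[which(hit)[1]] else NA_real_
}

# vapply() enforces that every call returns exactly one numeric value
df1$Price <- vapply(seq_len(nrow(df1)), lookup_price, numeric(1))
```

This still does one scan of df2 per row of df1, so it is mainly a tidier form of the loop rather than a faster one.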
