I have the following two data frames:

Date <- seq(as.Date("2013/1/1"), by = "day", length.out = 46)
x <-data.frame(Date)
x$discharge <- c("1000","1100","1200","1300","1400","1200","1300","1300","1200","1100","1200","1200","1100","1400","1200","1100","1400","1000","1100","1200","1300","1400","1200","1300","1300","1200","1100","1200","1200","1100","1400","1200","1100","1400","1000","1100","1200","1300","1400","1200","1300","1300","1200","1100","1200","1200")
x$discharge <- as.numeric(x$discharge)


Date_from <- c("2013-01-01","2013-01-15","2013-01-21","2013-02-10")
Date_to <- c("2013-01-07","2013-01-20","2013-01-25","2013-02-15")
y <- data.frame(Date_from,Date_to)
y$concentration <- c("1.5","2.5","1.5","3.5")
y$Date_from <- as.Date(y$Date_from)
y$Date_to <- as.Date(y$Date_to)
y$concentration <- as.numeric(y$concentration)

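For each row of data frame y, I am trying to calculate the mean discharge from the daily discharge values in data frame x, using the date range from Date_from to Date_to in that row. Note that there are gaps in the measurements in data frame y, from 2013-01-08 to 2013-01-14 and from 2013-01-26 to 2013-02-09. No measurements were taken during those periods. These gaps are giving me a headache, because I use the following code to calculate the mean discharge for each date range in y: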
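# Bin each date, using every Date_from plus the final Date_to as break points
rng <- cut(x$Date, breaks = c(y$Date_from, max(y$Date_to)),
           include.lowest = TRUE)
range <- cbind(x, rng)
# Mean discharge per bin
discharge <- aggregate(cbind(mean = x$discharge) ~ rng, FUN = mean)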

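However, if you check the ranges in data frame y, the range from 2013-01-01 to 2013-01-07 gets extended to 2013-01-14. I only need it to run to 2013-01-07, followed by a break until the next range starts on 2013-01-15.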
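
This follows from how cut() works: consecutive break points define contiguous intervals, so a bin that starts at one Date_from always runs up to the next Date_from, whatever Date_to says. As a minimal sketch that rebuilds only the dates from the example (the names dates, date_from, last_date_to and bins are illustrative, not from the original code), counting the days per bin makes this visible:

```r
# Daily dates and range starts, copied from the example data
dates <- seq(as.Date("2013-01-01"), by = "day", length.out = 46)
date_from <- as.Date(c("2013-01-01", "2013-01-15", "2013-01-21", "2013-02-10"))
last_date_to <- as.Date("2013-02-15")

# Consecutive breaks give contiguous bins: [from_1, from_2), [from_2, from_3), ...
bins <- cut(dates, breaks = c(date_from, last_date_to), include.lowest = TRUE)

# Days per bin: the first bin holds 14 days (2013-01-01 to 2013-01-14), not 7
table(bins)
```

The gap days are silently absorbed into the preceding bin, which is why the means come out wrong for the first and third ranges.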

Best answer

Here is a base R answer:

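# Cross join: with no shared column names, merge() pairs every row of x with every row of y
helper <- merge(x, y)
# Keep only the days that fall inside each [Date_from, Date_to] range
helper <- helper[helper$Date >= helper$Date_from & helper$Date <= helper$Date_to, ]
# Mean discharge per date range
aggregate(helper$discharge,
          list(Date_from = helper$Date_from,
               Date_to = helper$Date_to),
          FUN = 'mean')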

   Date_from    Date_to        x
1 2013-01-01 2013-01-07 1214.286
2 2013-01-15 2013-01-20 1166.667
3 2013-01-21 2013-01-25 1300.000
4 2013-02-10 2013-02-15 1216.667
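
The cross join in merge(x, y) builds nrow(x) * nrow(y) rows before filtering, which may become large for long series. As a rough alternative sketch (not part of the original answer), the same means can be computed by looping over the rows of y. The column name mean_discharge and the helper vector block are assumptions made for illustration:

```r
# Rebuild the example data so the snippet runs on its own
x <- data.frame(Date = seq(as.Date("2013-01-01"), by = "day", length.out = 46))
block <- c(1000, 1100, 1200, 1300, 1400, 1200, 1300, 1300, 1200,
           1100, 1200, 1200, 1100, 1400, 1200, 1100, 1400)
x$discharge <- rep(block, length.out = 46)  # same 46 values as in the question

y <- data.frame(
  Date_from = as.Date(c("2013-01-01", "2013-01-15", "2013-01-21", "2013-02-10")),
  Date_to   = as.Date(c("2013-01-07", "2013-01-20", "2013-01-25", "2013-02-15")),
  concentration = c(1.5, 2.5, 1.5, 3.5)
)

# For each row of y, average the discharge of the days inside [Date_from, Date_to]
y$mean_discharge <- sapply(seq_len(nrow(y)), function(i) {
  in_range <- x$Date >= y$Date_from[i] & x$Date <= y$Date_to[i]
  mean(x$discharge[in_range])
})

y
```

This keeps the result attached to y alongside concentration, at the cost of one pass over x per row of y.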

Regarding "r - Grouping data in a data frame by date ranges from a second data frame", we found a similar question on Stack Overflow: https://stackoverflow.com/questions/51157077/
