


I'm trying to calculate business hours between two dates. Business hours vary depending on the day.

Weekdays have 15 business hours (8:00-23:00); Saturdays and Sundays have 12 business hours (9:00-21:00).

For example: with start date 07/24/2020 22:20 (Friday) and end date 07/25/2020 21:20 (Saturday), the result should be 12.67 hours, since I'm only interested in the business hours.


Here is an example of the data frame and the desired output:

start_date            end_date            business_hours
07/24/2020 22:20     07/25/2020 21:20        12.67
07/14/2020 21:00     07/16/2020 09:30        18.50
07/18/2020 08:26     07/19/2020 10:00        13.00
07/10/2020 08:00     07/13/2020 11:00        42.00


Here is something you can try with lubridate. I adapted another function I had that I thought might be helpful.


First create a sequence of dates between the two dates of interest. Then create an interval of business hours for each date, depending on whether it falls on a weekend or not.

Then "clamp" the start and end times to the allowed business-hours intervals using pmin and pmax.

You can use time_length to get the duration of each interval; summing these gives the total business time elapsed.
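
To illustrate the clamping step on a single day, here is a minimal sketch. The names day_open, day_close and clamp are made up for this example, and the times are taken from the Friday in the question:

```r
library(lubridate)

# Business window for Friday 2020-07-24 (illustrative values)
day_open  <- ymd_hm("2020-07-24 08:00")
day_close <- ymd_hm("2020-07-24 23:00")

# Clamp a time into [day_open, day_close]
clamp <- function(x) pmax(pmin(x, day_close), day_open)

inside <- clamp(ymd_hm("2020-07-24 22:20"))  # already inside the window, unchanged
early  <- clamp(ymd_hm("2020-07-24 06:00"))  # before opening, moved to 08:00

# Hours from the clamped time until closing
hours_inside <- time_length(interval(inside, day_close), "hour")
hours_early  <- time_length(interval(early, day_close), "hour")
```

A start time before opening counts the full 15 hours, while 22:20 leaves only the last 40 minutes of the day.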


library(lubridate)
library(dplyr)  # for if_else()

calc_bus_hours <- function(start, end) {
  # One entry per calendar day from start to end
  my_dates <- seq.Date(as.Date(start), as.Date(end), by = "day")

  # Business-hours interval per day: 09:00-21:00 on weekends, 08:00-23:00 otherwise
  my_intervals <- if_else(weekdays(my_dates) %in% c("Saturday", "Sunday"),
    interval(ymd_hm(paste(my_dates, "09:00"), tz = "UTC"), ymd_hm(paste(my_dates, "21:00"), tz = "UTC")),
    interval(ymd_hm(paste(my_dates, "08:00"), tz = "UTC"), ymd_hm(paste(my_dates, "23:00"), tz = "UTC")))

  # Clamp the start of the first interval and the end of the last interval
  n <- length(my_intervals)
  int_start(my_intervals[1]) <- pmax(pmin(start, int_end(my_intervals[1])), int_start(my_intervals[1]))
  int_end(my_intervals[n]) <- pmax(pmin(end, int_end(my_intervals[n])), int_start(my_intervals[n]))

  # Total business hours across all days
  sum(time_length(my_intervals, "hour"))
}

calc_bus_hours(as.POSIXct("07/24/2020 22:20", format = "%m/%d/%Y %H:%M", tz = "UTC"), as.POSIXct("07/25/2020 21:20", format = "%m/%d/%Y %H:%M", tz = "UTC"))
[1] 12.66667

Edit: weekdays() returns day names in the current locale, so for a Spanish locale use c("sábado", "domingo") instead of c("Saturday", "Sunday").
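
If you would rather not depend on the locale at all, one option is to test the weekday number instead of its name. This is a minimal sketch using lubridate's wday() with week_start = 1, so that Monday is 1 and Sunday is 7. The helper name is_weekend is hypothetical and not part of the function above:

```r
library(lubridate)

# Hypothetical helper: TRUE for Saturday (6) and Sunday (7), whatever the locale
is_weekend <- function(dates) {
  wday(dates, week_start = 1) >= 6
}

# Friday, Saturday, Sunday, Monday
example_days <- as.Date(c("2020-07-24", "2020-07-25", "2020-07-26", "2020-07-27"))
weekend_flags <- is_weekend(example_days)
```

You could then replace the weekdays(my_dates) %in% ... condition with is_weekend(my_dates).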

For the data frame example, you can use mapply to call the function with the two date columns as arguments (this assumes both columns are already date-times, e.g. POSIXct, rather than character strings). Try:

df$business_hours <- mapply(calc_bus_hours, df$start_date, df$end_date)

                start                 end business_hours
1 2020-07-24 22:20:00 2020-07-25 21:20:00       12.66667
2 2020-07-14 21:00:00 2020-07-16 09:30:00       18.50000
3 2020-07-18 08:26:00 2020-07-19 10:00:00       13.00000
4 2020-07-10 08:00:00 2020-07-13 11:00:00       42.00000
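
If your columns are still character strings in the format shown in the question, they need to be parsed first. Below is a self-contained sketch of the whole workflow, assuming mdy_hm() for parsing and UTC throughout. Note that bus_hours is a hypothetical compact variant written for this example, not the calc_bus_hours function above: it clamps every day's window to the start and end times and uses the weekday number rather than the weekday name.

```r
library(lubridate)

# Hypothetical variant: clamp each day's business window to [start, end]
bus_hours <- function(start, end) {
  days     <- seq(as.Date(start), as.Date(end), by = "day")
  weekend  <- wday(days, week_start = 1) >= 6          # Saturday = 6, Sunday = 7
  midnight <- as.POSIXct(format(days), tz = "UTC")
  day_open  <- midnight + 3600 * ifelse(weekend, 9, 8)   # 09:00 or 08:00
  day_close <- midnight + 3600 * ifelse(weekend, 21, 23) # 21:00 or 23:00
  from <- pmax(day_open, start)
  to   <- pmin(day_close, end)
  # Windows entirely outside [start, end] would be negative, so floor at 0
  sum(pmax(as.numeric(difftime(to, from, units = "hours")), 0))
}

# Data from the question, parsed from month/day/year strings
df <- data.frame(
  start_date = mdy_hm(c("07/24/2020 22:20", "07/14/2020 21:00",
                        "07/18/2020 08:26", "07/10/2020 08:00")),
  end_date   = mdy_hm(c("07/25/2020 21:20", "07/16/2020 09:30",
                        "07/19/2020 10:00", "07/13/2020 11:00"))
)

df$business_hours <- mapply(bus_hours, df$start_date, df$end_date)
```

All times here are treated as UTC, as in the function above; if your data is in a local time zone with daylight saving changes, you would need to handle that separately.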

