Problem description
Here is my example. I am reading the following file: sample_data
library(dplyr)

# Sample data: one row per event, with a group key (MDN) and a timestamp
txt <- c('"", "MDN", "Cl_Date"',
         '"1", "A", "2017-04-15 15:10:42.510"',
         '"2", "A", "2017-04-01 14:47:23.210"',
         '"3", "A", "2017-04-01 14:49:54.063"',
         '"4", "B", "2017-04-30 13:25:00.000"',
         '"5", "B", "2017-04-03 17:53:13.217"',
         '"6", "B", "2017-04-15 15:17:43.780"')
ts <- read.csv(text = txt, as.is = TRUE)
ts$Cl_Date <- as.POSIXct(ts$Cl_Date)

# Within each MDN group, take the gap to the previous timestamp (0 for the first row)
ts <- ts %>% group_by(MDN) %>% arrange(Cl_Date) %>%
  mutate(time_diff = c(0, diff(Cl_Date)))
ts <- ts[order(ts$MDN, ts$Cl_Date), ]
As a result I get:
MDN  Cl_Date          time_diff
A    4/1/2017 14:47   0
A    4/1/2017 14:49   2.514216665
A    4/15/2017 15:10  20180.80745
B    4/3/2017 17:53   0
B    4/15/2017 15:17  11.89202041
B    4/30/2017 13:25  14.92171551
So I group by the MDN column and compute the difference between consecutive Cl_Date values. As you can see, the difference is sometimes in minutes (group A) and sometimes in days (group B).
Why is the time difference in different units, and how can I correct it?
P.S. I could not reproduce the same example by creating the data.frame manually, so I had to read it from a file.
UPDATE 1
diff(ts$Cl_Date) seems to be consistent: everything is in minutes. Does something break within dplyr?
UPDATE 2
ts <- ts %>% group_by(MDN) %>% arrange(Cl_Date) %>%
  mutate(time_diff_2 = Cl_Date - lag(Cl_Date))
produces the same result.
Recommended answer
ts <- ts %>% group_by(MDN) %>% arrange(Cl_Date) %>%
  mutate(time_diff_2 = as.numeric(Cl_Date - lag(Cl_Date), units = 'mins'))
Convert the time difference to a numeric value. You can use the units argument to make the return values consistent.
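As for why the units differ, a likely explanation is that diff() on date-times picks its units automatically from the smallest gap in the vector it receives, and under group_by() it receives each group separately. The sketch below illustrates this in base R with the timestamps from the question. The names a, b and gap_mins are hypothetical, and the UTC time zone is an assumption made only to keep the example reproducible.

```r
# Timestamps for the two groups in the question.
# tz = "UTC" is an assumption, used only for reproducibility.
a <- as.POSIXct(c("2017-04-01 14:47:23.210",
                  "2017-04-01 14:49:54.063",
                  "2017-04-15 15:10:42.510"), tz = "UTC")
b <- as.POSIXct(c("2017-04-03 17:53:13.217",
                  "2017-04-15 15:17:43.780",
                  "2017-04-30 13:25:00.000"), tz = "UTC")

# diff() returns a difftime whose units depend on the smallest gap
# in that particular vector.
units(diff(a))  # smallest gap in A is about 2.5 minutes
units(diff(b))  # every gap in B is longer than a day

# c(0, diff(x)) drops the units attribute, so the column ends up
# holding bare numbers on different scales.
mixed <- c(0, diff(a))

# Hypothetical helper: fix the units before converting to numeric.
gap_mins <- function(x) c(0, as.numeric(diff(x), units = "mins"))
mins_a <- gap_mins(a)
mins_b <- gap_mins(b)
```

Note that the lag() version in the answer leaves NA in the first row of each group, whereas c(0, diff(...)) puts 0 there; which one is preferable depends on how the column is used afterwards.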