library(dplyr)

pcd <- data.frame(tripNo = c(618, 618, 610, 610, 610, 619),
              procDate = as.Date(c('2016-03-02', '2016-03-03', '2016-03-02', '2016-03-03', '2016-03-02', '2016-03-03')),
              delay = c(7.45, 12.90, 11.88, 6.66, 12.50, 9.41) )

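I'd like to flag inconsistencies in trips that are processed on two different days, where the delay on the second day is shorter than the last delay of the previous day. This is what I have now: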
pcd %>%
  arrange(tripNo, procDate, delay) %>%
  group_by(tripNo) %>%
  mutate(delayErr = (row_number() != 1) & (delay < lag(delay)),
         Alert = ifelse(delayErr, '!', '')) %>%
  select(tripNo, procDate, delay, delayErr, Alert)

  tripNo   procDate delay delayErr Alert
   (dbl)     (date) (dbl)    (lgl) (chr)
1    610 2016-03-02 11.88    FALSE
2    610 2016-03-02 12.50    FALSE
3    610 2016-03-03  6.66     TRUE     !
4    618 2016-03-02  7.45    FALSE
5    618 2016-03-03 12.90    FALSE
6    619 2016-03-03  9.41    FALSE

So this works. My question is about my first attempt, in which I tried to use substr:
pcd %>%
  arrange(tripNo, procDate, delay) %>%
  group_by(tripNo) %>%
  mutate(delayErr = (row_number() != 1) & (delay < lag(delay)),
         Alert = substr(' !', delayErr + 1, delayErr + 1)) %>%  # <<< This is the only change
  select(tripNo, procDate, delay, delayErr, Alert)

  tripNo   procDate delay delayErr Alert
   (dbl)     (date) (dbl)    (lgl) (chr)
1    610 2016-03-02 11.88    FALSE
2    610 2016-03-02 12.50    FALSE
3    610 2016-03-03  6.66     TRUE
4    618 2016-03-02  7.45    FALSE
5    618 2016-03-03 12.90    FALSE
6    619 2016-03-03  9.41    FALSE

With this code, the alert does not show up as expected.
Can someone explain to me why the second dplyr query does not work?
Thanks!

Best answer

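There is already a fully vectorized version of substr, namely substring. substr returns a result only as long as its first argument, so with the length-1 string ' !' it uses just the first element of delayErr + 1. Within each group the first row always has delayErr equal to FALSE, so substr returns a single ' ', which mutate then recycles to every row of the group. substring instead recycles all of its arguments to the length of the longest one, so it returns one character per row: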

pcd %>%
  arrange(tripNo, procDate, delay) %>%
  group_by(tripNo) %>%
  mutate(delayErr = (row_number() != 1) & (delay < lag(delay)),
         Alert = substring(' !', delayErr +1, delayErr +1)) %>%
  select(tripNo, procDate, delay, delayErr, Alert)
#   tripNo   procDate delay delayErr Alert
#   (dbl)     (date) (dbl)    (lgl) (chr)
#1    610 2016-03-02 11.88    FALSE
#2    610 2016-03-02 12.50    FALSE
#3    610 2016-03-03  6.66     TRUE     !
#4    618 2016-03-02  7.45    FALSE
#5    618 2016-03-03 12.90    FALSE
#6    619 2016-03-03  9.41    FALSE
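
To see the difference outside of dplyr, here is a minimal sketch in base R. The vector `flag` is a made-up stand-in for the `delayErr` values of trip 610, and the variable names are only for illustration. The last call shows one possible alternative that stays with substr by repeating the string so that its first argument is as long as the index vector.

```r
# Hypothetical stand-in for delayErr within one group (trip 610)
flag <- c(FALSE, FALSE, TRUE)

# substr(): the result has the length of x (here 1), so only the first
# element of start/stop is used
from_substr <- substr(' !', flag + 1, flag + 1)

# substring(): all arguments are recycled to the longest length (here 3)
from_substring <- substring(' !', flag + 1, flag + 1)

# Staying with substr(): make x as long as the index vector
from_rep <- substr(rep(' !', length(flag)), flag + 1, flag + 1)

length(from_substr)     # 1
length(from_substring)  # 3
from_substring          # " " " " "!"
```

Inside mutate the length-1 result of substr is silently recycled across the group, which is why no error appears and every row ends up with a blank Alert.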

This question, "r - substr in dplyr %>% mutate", comes from Stack Overflow: https://stackoverflow.com/questions/36676502/
