I have this dataset:

  CASHPOINT_ID         DT     status QT_REC
1   N053360330 2016-01-01 end_of_day      5
2   N053360330 2016-01-01 end_of_day      2
3   N053360330 2016-01-02     before      9
4   N053360330 2016-01-02     before     NA
5   N053360330 2016-01-03 end_of_day     16
6   N053360330 2016-01-03 end_of_day     NA

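I only want to aggregate the rows whose status column is not marked "before", and keep the other rows unchanged. The resulting dataset should look like this: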
  CASHPOINT_ID         DT     status QT_REC
1   N053360330 2016-01-01 end_of_day      7
3   N053360330 2016-01-02     before      9
4   N053360330 2016-01-02     before     NA
5   N053360330 2016-01-03 end_of_day     16

Thanks.

Best answer

Using data.table
Assuming your original data is called dt and has already been converted with setDT(), you can do the following:

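# Sum QT_REC per group for the end_of_day rows (NA values are skipped),
# keep every other row as it is, then sort the combined rows by date.
df <- rbind(
  dt[status == "end_of_day", .(QT_REC = sum(QT_REC, na.rm = TRUE)),
     by = .(CASHPOINT_ID, DT, status)],
  dt[status != "end_of_day"]
)[order(DT)]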

print(df)
   CASHPOINT_ID         DT     status QT_REC
1:   N053360330 2016-01-01 end_of_day      7
2:   N053360330 2016-01-02     before      9
3:   N053360330 2016-01-02     before     NA
4:   N053360330 2016-01-03 end_of_day     16
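
For anyone who wants to try this from scratch, here is a minimal self-contained sketch of the same approach. It assumes the data.table package is installed. The way dt is built here is my reconstruction from the table in the question (DT is assumed to be a Date column), and the names aggregated, untouched and result are only illustrative.

```r
library(data.table)

# Rebuild the question's sample data (values copied from the table in the
# question; storing DT as a Date is an assumption).
dt <- data.table(
  CASHPOINT_ID = "N053360330",
  DT = as.Date(c("2016-01-01", "2016-01-01", "2016-01-02",
                 "2016-01-02", "2016-01-03", "2016-01-03")),
  status = c("end_of_day", "end_of_day", "before",
             "before", "end_of_day", "end_of_day"),
  QT_REC = c(5, 2, 9, NA, 16, NA)
)

# Sum QT_REC within each end_of_day group; NA values are skipped.
aggregated <- dt[status == "end_of_day",
                 .(QT_REC = sum(QT_REC, na.rm = TRUE)),
                 by = .(CASHPOINT_ID, DT, status)]

# Rows with any other status are kept exactly as they are.
untouched <- dt[status != "end_of_day"]

# Stack both parts and sort by date.
result <- rbind(aggregated, untouched)[order(DT)]
print(result)
```

One thing to keep in mind: with na.rm = TRUE, an end_of_day group in which every QT_REC is NA sums to 0 rather than NA, so handle that case separately if the distinction matters for your data.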

For more on r - aggregating rows based only on the values of another column, see the similar question on Stack Overflow: https://stackoverflow.com/questions/48319666/
