I have this dataset:
  CASHPOINT_ID         DT     status QT_REC
1   N053360330 2016-01-01 end_of_day      5
2   N053360330 2016-01-01 end_of_day      2
3   N053360330 2016-01-02     before      9
4   N053360330 2016-01-02     before     NA
5   N053360330 2016-01-03 end_of_day     16
6   N053360330 2016-01-03 end_of_day     NA
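I want to aggregate only the rows whose status column is not "before", and leave the "before" rows unchanged. The resulting dataset should look like this: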
  CASHPOINT_ID         DT     status QT_REC
1   N053360330 2016-01-01 end_of_day      7
3   N053360330 2016-01-02     before      9
4   N053360330 2016-01-02     before     NA
5   N053360330 2016-01-03 end_of_day     16
Thanks.
Best answer
Using data.table: assuming your original data is called dt and has already been converted with setDT(), you can do the following:
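df <- rbind(
  # sum QT_REC within each CASHPOINT_ID / DT / status group for the "end_of_day" rows
  dt[status == "end_of_day", .(QT_REC = sum(QT_REC, na.rm = TRUE)),
     by = .(CASHPOINT_ID, DT, status)],
  # keep every other row (here, the "before" rows) exactly as it is
  dt[status != "end_of_day"]
)[order(DT)]
print(df)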
   CASHPOINT_ID         DT     status QT_REC
1:   N053360330 2016-01-01 end_of_day      7
2:   N053360330 2016-01-02     before      9
3:   N053360330 2016-01-02     before     NA
4:   N053360330 2016-01-03 end_of_day     16
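For anyone who wants to run this end to end, here is a minimal reproducible sketch. The sample table is rebuilt by hand from the question, so the column types (Date for DT, integer for QT_REC) are assumptions, and res is just an illustrative name. It also filters on status != "before", which follows the wording of the question more literally than status == "end_of_day" and would matter if other status values existed.

```r
library(data.table)

# Sample data rebuilt from the question (column types are assumed)
dt <- data.table(
  CASHPOINT_ID = "N053360330",
  DT = as.Date(c("2016-01-01", "2016-01-01", "2016-01-02",
                 "2016-01-02", "2016-01-03", "2016-01-03")),
  status = c("end_of_day", "end_of_day", "before",
             "before", "end_of_day", "end_of_day"),
  QT_REC = c(5L, 2L, 9L, NA, 16L, NA)
)

# Aggregate every row that is not "before"; pass the "before" rows through untouched
res <- rbind(
  dt[status != "before", .(QT_REC = sum(QT_REC, na.rm = TRUE)),
     by = .(CASHPOINT_ID, DT, status)],
  dt[status == "before"]
)[order(DT)]

print(res)
```

One thing to watch: sum(..., na.rm = TRUE) returns 0 for a group in which every QT_REC is NA, so such a group would show 0 rather than NA. That does not happen with the sample data, but it may with real data.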
Regarding "r - Aggregate rows based only on the value of another column", a similar question can be found on Stack Overflow: https://stackoverflow.com/questions/48319666/