I have data like this:
emailaddress customer_acquisation_date customer_order_date payment_amount
[email protected] 01/05/2013 6:24 AM 01/05/2013 5:10 AM $ 20.67
[email protected] 01/05/2013 6:24 AM 02/07/2013 7:21 PM $ 25.56
[email protected] 01/05/2013 6:24 AM 07/10/2013 8:00 AM $100.00
[email protected] 01/05/2013 6:24 AM 08/12/2013 9:35 AM $30.00
I want to sum payment_amount by email address, and I would like the final output to be:
emailaddress customer_acquisation_date customer_order_date payment_amount
[email protected] 01/05/2013 6:24 AM 01/05/2013 $ 177
02/07/2013
07/10/2013
08/12/2013
The code I am writing:
z <- aggregate(x$emailaddress~x$paymentamount,data=x,FUN=sum)
I get this error:
Error in Summary.factor(c(211594L, 291939L, 79240L, 208971L, 369325L, :
‘sum’ not meaningful for factors
What is the correct way to do this? Any help is appreciated.
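Best answer
In the formula passed to aggregate, the value to aggregate comes first (left of the ~) and the grouping variable comes second (right of the ~). Your formula has them reversed, so sum is applied to emailaddress, which is a factor, and that causes the error. As already noted, you also need to remove the dollar sign so the payment column can be converted to numeric.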
# Remove the dollar sign so the column can be converted to numeric
x$payment_amount <- as.numeric(gsub('[$]', '', x$payment_amount))
# Formula interface: value to aggregate ~ grouping variable
z <- aggregate(payment_amount ~ emailaddress, data = x, FUN = sum)
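For anyone who wants to try this without the original data, here is a minimal self-contained sketch. The e-mail addresses and the three rows are made-up placeholders (the real addresses are redacted above). The columns are deliberately built as factors to mimic the situation that produces the error, and the pattern also strips spaces and commas, which is an assumption about how the amounts might be formatted.

```r
# Hypothetical sample data; the e-mail addresses are placeholders.
x <- data.frame(
  emailaddress   = c("a@example.com", "a@example.com", "b@example.com"),
  payment_amount = c("$ 20.67", "$ 25.56", "$100.00"),
  stringsAsFactors = TRUE  # factor columns, as in the question
)

# gsub() coerces the factor to character; strip "$", "," and spaces,
# then convert the cleaned strings to numeric.
x$payment_amount <- as.numeric(gsub("[$, ]", "", x$payment_amount))

# Value to aggregate on the left of ~, grouping column on the right.
z <- aggregate(payment_amount ~ emailaddress, data = x, FUN = sum)
print(z)
```

The result has one row per e-mail address, with payment_amount holding the group total.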
Edit: adding a data.table solution that keeps the other columns.
library(data.table)
setDT(x) # Convert the data.frame to data.table
z = x[, payment_total := sum(payment_amount), by = emailaddress]
setDF(z) # Convert the result to data.frame
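Note that := adds the total as a new column on every row; it does not collapse the rows. If the goal is one row per e-mail address, closer to the output shown in the question, a possible sketch is below. It assumes the amounts are already numeric, uses a placeholder e-mail address, and introduces the column names customer_order_dates and payment_total purely for illustration.

```r
library(data.table)

# Hypothetical sample data modelled on the question; the address is a placeholder.
x <- data.table(
  emailaddress              = rep("a@example.com", 4),
  customer_acquisation_date = rep("01/05/2013 6:24 AM", 4),
  customer_order_date       = c("01/05/2013", "02/07/2013",
                                "07/10/2013", "08/12/2013"),
  payment_amount            = c(20.67, 25.56, 100.00, 30.00)
)

# One row per e-mail: keep the first acquisition date, paste the order
# dates into a single string, and sum the payments.
z <- x[, .(
  customer_acquisation_date = customer_acquisation_date[1],
  customer_order_dates      = paste(customer_order_date, collapse = "; "),
  payment_total             = sum(payment_amount)
), by = emailaddress]

setDF(z)  # convert the result to a plain data.frame
print(z)
```

Pasting the dates into one string is only one way to keep them; a list column would also work if the individual dates are needed later.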
Regarding "r - aggregate by payment amount", a similar question can be found on Stack Overflow: https://stackoverflow.com/questions/46163130/