I have data like this:

emailaddress    customer_acquisation_date  customer_order_date  payment_amount
[email protected]     01/05/2013 6:24 AM         01/05/2013 5:10 AM           $ 20.67
[email protected]     01/05/2013 6:24 AM         02/07/2013 7:21 PM           $ 25.56
[email protected]     01/05/2013 6:24 AM         07/10/2013 8:00 AM           $100.00
[email protected]     01/05/2013 6:24 AM         08/12/2013 9:35 AM           $30.00

I want to sum payment_amount by email address, and I would like the final output to be:
emailaddress    customer_acquisation_date  customer_order_date  payment_amount
[email protected]     01/05/2013 6:24 AM         01/05/2013            $ 177
                                            02/07/2013
                                            07/10/2013
                                            08/12/2013

The code I am writing:
z <- aggregate(x$emailaddress~x$paymentamount,data=x,FUN=sum)

I get this error:
Error in Summary.factor(c(211594L, 291939L, 79240L, 208971L, 369325L,  :
  ‘sum’ not meaningful for factors

What is the correct way to do this? Any help is appreciated.

Best answer

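In the formula interface, aggregate expects the value to aggregate on the left of the ~ and the grouping variable on the right. Your call has them reversed, so sum is applied to emailaddress, which is a factor, and that triggers the error. As mentioned, you also need to remove the dollar sign so the column can be converted to numeric.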

# Remove the dollar sign, then convert the column to numeric
x$payment_amount <- as.numeric(gsub('[$]', '', x$payment_amount))

# Write it in the right order: aggregate(value ~ group, data, FUN)
z <- aggregate(payment_amount ~ emailaddress, data = x, FUN = sum)
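To check the two steps above end to end, here is a minimal sketch on a small data frame that mirrors the question. The address user@example.com is a placeholder I made up (the real addresses are hidden in the post), and stringsAsFactors = TRUE is an assumption meant to mimic the factor columns implied by the error message.

```r
# Hypothetical sample data shaped like the question's table.
# "user@example.com" is a placeholder, not the asker's real address.
x <- data.frame(
  emailaddress = rep("user@example.com", 4),
  customer_order_date = c("01/05/2013 5:10 AM", "02/07/2013 7:21 PM",
                          "07/10/2013 8:00 AM", "08/12/2013 9:35 AM"),
  payment_amount = c("$ 20.67", "$ 25.56", "$100.00", "$30.00"),
  stringsAsFactors = TRUE  # assumption: columns were read in as factors
)

# Strip the dollar sign; as.numeric tolerates the leftover leading space
x$payment_amount <- as.numeric(gsub("[$]", "", x$payment_amount))

# Value to aggregate on the left of ~, grouping variable on the right
z <- aggregate(payment_amount ~ emailaddress, data = x, FUN = sum)
print(z)
```

For these four rows z has a single row whose payment_amount is 176.23, which the question shows rounded to 177.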

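Edit: adding a data.table solution that keeps the other columns.
 library(data.table)
 z <- as.data.table(x) # Copy the data.frame into a data.table; x is left unchanged
 # := adds the per-address total to every row by reference, so no reassignment is needed
 z[, payment_total := sum(payment_amount), by = emailaddress]
 setDF(z) # Convert the result back to a data.frame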
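The data.table version keeps one row per order. If a single row per address is wanted instead, closer to the expected output in the question, one possible base R sketch is to aggregate the dates and the amounts separately and merge the results. The sample data, the placeholder address, and the names dates, totals and result are all assumptions for illustration, and payment_amount is taken to be numeric already.

```r
# Hypothetical sample data; payment_amount is assumed to be numeric already
x <- data.frame(
  emailaddress = rep("user@example.com", 4),
  customer_order_date = c("01/05/2013 5:10 AM", "02/07/2013 7:21 PM",
                          "07/10/2013 8:00 AM", "08/12/2013 9:35 AM"),
  payment_amount = c(20.67, 25.56, 100.00, 30.00),
  stringsAsFactors = FALSE
)

# Drop the time part of each order date and collapse the dates per address
dates <- aggregate(
  customer_order_date ~ emailaddress, data = x,
  FUN = function(d) paste(sub(" .*", "", d), collapse = ", ")
)

# Total payment per address
totals <- aggregate(payment_amount ~ emailaddress, data = x, FUN = sum)

# One row per address: collapsed order dates plus the total
result <- merge(dates, totals, by = "emailaddress")
print(result)
```

Collapsing the dates into one string loses the one-date-per-line layout shown in the question, so this is only an approximation of that output.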

Regarding "r - aggregate by payment amount", a similar question can be found on Stack Overflow: https://stackoverflow.com/questions/46163130/
