Cumulative number of unique values in a column up to the current row
Question
I have a data frame, donorInfo, with donor information:
id giftdate giftamt
002 2001-01-05 25.00
033 2001-05-08 50.00
054 2001-09-22 125.00
125 2001-11-05 40.00
042 2001-12-04 75.00
... ... ...
I'd like to create a column that shows the cumulative number of unique donor ids up to that date. I think it's something like:
donorInfo$numUnique <- apply/lapply (donorInfo, 1, FUN=nrow(unique(donorInfo$id)))
Unfortunately this isn't working, and I'm wondering how to fix it. Thanks for any suggestions.
Recommended answer
You can do this with duplicated() and cumsum() (taking advantage of the fact that logical vectors are coerced to numeric when summed, with TRUE counting as 1 and FALSE as 0):
# Example data.frame with some duplicated ids
df <- read.table(text="
id giftdate giftamt
2 2001-01-05 25
33 2001-05-08 50
2 2001-09-22 125
33 2001-11-05 40
42 2001-12-04 75", header=TRUE)
# !duplicated() is TRUE at the first occurrence of each id; cumsum() counts those
cumsum(!duplicated(df$id))
# [1] 1 2 2 2 3
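Since the question asks for the count "up to that date", the rows need to be in date order before the running count is taken. The following is a minimal sketch of storing the result as a column, assuming giftdate can be parsed as a Date. The sample values and the names firstGift and numUnique are illustrative, not taken from the real donorInfo data.

```r
# Hypothetical donor data, deliberately out of date order
donorInfo <- data.frame(
  id       = c(33, 2, 2, 42, 33),
  giftdate = as.Date(c("2001-05-08", "2001-01-05", "2001-09-22",
                       "2001-12-04", "2001-11-05")),
  giftamt  = c(50, 25, 125, 75, 40)
)

# Sort by date so "up to the current row" means "up to that date"
donorInfo <- donorInfo[order(donorInfo$giftdate), ]

# TRUE for the first occurrence of each id, FALSE for repeats
firstGift <- !duplicated(donorInfo$id)

# cumsum() treats TRUE/FALSE as 1/0, giving the running count of unique ids
donorInfo$numUnique <- cumsum(firstGift)

donorInfo$numUnique
# [1] 1 2 2 2 3
```

Keeping the intermediate firstGift vector is optional; it is split out here only to make the two steps visible.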