Problem description
I have a big data.frame to which I want to add a new column (called Seq) holding a sequential value that restarts every time the value in a different column changes. Here is an example of the data.frame (with some columns omitted) and the new column called Seq. As you can see, there is a sequential count, but every time there is a new IDPath the count restarts. The sequences can have different lengths: some are 1 long, while others are 300.
IDPath LogTime Seq
AADS 19-06-2015 01:57 1
AADS 19-06-2015 01:55 2
AADS 19-06-2015 01:54 3
AADS 19-06-2015 01:53 4
DHSD 19-06-2015 12:57 1
DHSD 19-06-2015 10:58 2
DHSD 19-06-2015 09:08 3
DHSD 19-06-2015 08:41 4
Recommended answer
Obligatory Hadleyverse answer (a base R answer is also included after the Hadleyverse one):
library(dplyr)
dat <- read.table(text="IDPath LogTime
AADS '19-06-2015 01:57'
AADS '19-06-2015 01:55'
AADS '19-06-2015 01:54'
AADS '19-06-2015 01:53'
DHSD '19-06-2015 12:57'
DHSD '19-06-2015 10:58'
DHSD '19-06-2015 09:08'
DHSD '19-06-2015 08:41' ", header=TRUE, stringsAsFactors=FALSE, quote="'")
mutate(group_by(dat, IDPath), Seq=1:n())
OR (via David Arenburg)
mutate(group_by(dat, IDPath), Seq=row_number())
Or if you're into piping:
dat %>%
group_by(IDPath) %>%
mutate(Seq=1:n())
OR (via David Arenburg)
dat %>%
group_by(IDPath) %>%
mutate(Seq=row_number())
Obligatory base R answer:
unsplit(lapply(split(dat, dat$IDPath), transform, Seq=1:length(IDPath)), dat$IDPath)
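OR more idiomatically (via David again)
# ave() returns a vector of the same type as its first argument, so pass an
# integer index rather than the character IDPath to get an integer Seq.
with(dat, ave(seq_along(IDPath), IDPath, FUN = seq_along))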
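All of the versions above number rows within each IDPath group, which matches the question as long as every IDPath occupies one contiguous block of rows. If the same IDPath could reappear later and the counter should restart at that point too, a run-based counter is one option. The following is a minimal base R sketch under that assumption; the `ids` vector is hypothetical data, not taken from the question.

```r
# Hypothetical example in which "AADS" appears in two separate runs.
ids <- c("AADS", "AADS", "DHSD", "DHSD", "DHSD", "AADS")

# rle() reports the length of each run of identical consecutive values;
# sequence() expands every run length n into 1, 2, ..., n.
seq_by_run <- sequence(rle(ids)$lengths)

# Grouped counter for comparison: it keeps counting when "AADS" reappears.
seq_by_group <- ave(seq_along(ids), ids, FUN = seq_along)

print(data.frame(ids, seq_by_run, seq_by_group))
```

For the data shown in the question the two counters agree, because each IDPath forms a single run.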
If it really is a HUGE data frame then you may want to start with tbl_dt(dat) for the dplyr solutions, but CathG's or Jaap's versions will be faster if you're already using data.table.
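For readers who have not seen the data.table route, here is a minimal sketch of what a grouped counter could look like in that package. It is not necessarily the code from the answers mentioned above, and it assumes the data.table package is installed; the small `dat` below is a cut-down copy of the question's data.

```r
library(data.table)

# Cut-down copy of the example data from the question.
dat <- data.frame(
  IDPath  = c("AADS", "AADS", "AADS", "DHSD", "DHSD"),
  LogTime = c("19-06-2015 01:57", "19-06-2015 01:55", "19-06-2015 01:54",
              "19-06-2015 12:57", "19-06-2015 10:58"),
  stringsAsFactors = FALSE
)

# as.data.table() makes a copy; setDT(dat) would convert in place instead.
dt <- as.data.table(dat)

# .N is the number of rows in the current group, so seq_len(.N) counts
# 1..n within each IDPath; := adds the column by reference.
dt[, Seq := seq_len(.N), by = IDPath]

print(dt)
```

Because `:=` modifies `dt` by reference, no copy of the table is made when the column is added, which is the main attraction on very large inputs.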