I am trying to add a label to each group. Here is the dataset.
group
1 p01
2 p01
3 p01
4 p01
5 p02
6 p01
7 p01
8 p01
9 p02
10 p02
11 p01
12 p01
structure(list(group = structure(c(1L, 1L, 1L, 1L, 2L, 1L, 1L, 1L,
2L, 2L, 1L, 1L), .Label = c("p01", "p02"), class = "factor")), class =
"data.frame", row.names = c(NA,
-12L))
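Here is the expected table. Each group should be numbered by its runs of consecutive rows: for p01, the expected column is 1 for rows 1-4, then 2 for rows 6-8, and 3 for rows 11-12.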
group new_group
1 p01 1
2 p01 1
3 p01 1
4 p01 1
5 p02 1
6 p01 2
7 p01 2
8 p01 2
9 p02 2
10 p02 2
11 p01 3
12 p01 3
How can I do this in R using dplyr?
Best answer
Another possibility:
library(dplyr)
#>
#> Attaching package: 'dplyr'
#> The following objects are masked from 'package:stats':
#>
#> filter, lag
#> The following objects are masked from 'package:base':
#>
#> intersect, setdiff, setequal, union
df <- structure(list(group = structure(c(1L, 1L, 1L, 1L, 2L, 1L, 1L, 1L, 2L, 2L, 1L, 1L), .Label = c("p01", "p02"), class = "factor")), class = "data.frame", row.names = c(NA, -12L))
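df %>%
  # Number every run of identical consecutive values across the whole column
  mutate(new_group = with(rle(as.integer(group)), rep(seq_along(lengths), lengths))) %>%
  group_by(group) %>%
  # Within each group, renumber the run ids as 1, 2, 3, ...
  transmute(new_group = as.integer(as.factor(new_group))) %>%
  ungroup()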
#> # A tibble: 12 x 2
#> group new_group
#> <fct> <int>
#> 1 p01 1
#> 2 p01 1
#> 3 p01 1
#> 4 p01 1
#> 5 p02 1
#> 6 p01 2
#> 7 p01 2
#> 8 p01 2
#> 9 p02 2
#> 10 p02 2
#> 11 p01 3
#> 12 p01 3
Created on 2019-08-12 by the reprex package (v0.3.0)
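To see why the two-step pipeline works, it can help to look at the intermediate values. The following is a minimal sketch in base R that rebuilds the same data as a character vector; the names `g`, `runs`, `run_id`, and `new_group` are illustrative and do not come from the original answer.

```r
# Same data as in the question, as a plain character vector
g <- c("p01", "p01", "p01", "p01", "p02", "p01",
       "p01", "p01", "p02", "p02", "p01", "p01")

# Step 1: run-length encode the vector; each run is a stretch of equal values
runs <- rle(g)
runs$values   # "p01" "p02" "p01" "p02" "p01"
runs$lengths  # 4 1 3 2 2

# Step 2: give every row the index of the run it belongs to (global numbering)
run_id <- rep(seq_along(runs$lengths), runs$lengths)

# Step 3: renumber the run indices starting from 1 within each group
new_group <- ave(run_id, g, FUN = function(x) match(x, unique(x)))
new_group     # 1 1 1 1 1 2 2 2 2 2 3 3
```

Step 2 corresponds to the `rle()` expression inside `mutate()`, and step 3 plays the role of the grouped `as.integer(as.factor(...))` call.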
Source: "r - How to label groups for consecutive patterns in R?" on Stack Overflow: https://stackoverflow.com/questions/57454657/
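For readers on a newer dplyr, the same labels can probably be produced without `rle()`. This is a sketch that assumes dplyr 1.1.0 or later, which added `consecutive_id()` and is newer than the version available when the answer was written; the column name `run_id` and the object `res` are illustrative.

```r
library(dplyr)

# Same data as in the question
df <- data.frame(group = factor(c("p01", "p01", "p01", "p01", "p02", "p01",
                                  "p01", "p01", "p02", "p02", "p01", "p01")))

res <- df %>%
  # consecutive_id() starts a new id whenever the value changes (dplyr >= 1.1.0)
  mutate(run_id = consecutive_id(group)) %>%
  group_by(group) %>%
  # dense_rank() renumbers the run ids as 1, 2, 3, ... within each group
  mutate(new_group = dense_rank(run_id)) %>%
  ungroup() %>%
  select(group, new_group)

res
```

The design is the same as in the answer: first a global run number, then a per-group renumbering. Only the helper functions differ.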