I'm trying to add a label to each consecutive run within every group. Here is the dataset.

   group
1    p01
2    p01
3    p01
4    p01
5    p02
6    p01
7    p01
8    p01
9    p02
10   p02
11   p01
12   p01

structure(list(group = structure(c(1L, 1L, 1L, 1L, 2L, 1L, 1L, 1L,
2L, 2L, 1L, 1L), .Label = c("p01", "p02"), class = "factor")), class =
"data.frame", row.names = c(NA,
-12L))

This is the expected table. For p01, counting consecutive runs, the expected column is 1 for rows 1-4, then 2 for rows 6-8, and 3 for rows 11-12.
   group new_group
1    p01         1
2    p01         1
3    p01         1
4    p01         1
5    p02         1
6    p01         2
7    p01         2
8    p01         2
9    p02         2
10   p02         2
11   p01         3
12   p01         3

How can I do this in R using dplyr?

Best answer

Another possibility:

library(dplyr)
#>
#> Attaching package: 'dplyr'
#> The following objects are masked from 'package:stats':
#>
#>     filter, lag
#> The following objects are masked from 'package:base':
#>
#>     intersect, setdiff, setequal, union

df <- structure(list(group = structure(c(1L, 1L, 1L, 1L, 2L, 1L, 1L, 1L, 2L, 2L, 1L, 1L), .Label = c("p01", "p02"), class = "factor")), class = "data.frame", row.names = c(NA, -12L))

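df %>%
  # Number every consecutive run of `group` across the whole data frame (1, 2, 3, ...)
  mutate(new_group = with(rle(as.integer(group)), rep(seq_along(lengths), lengths))) %>%
  group_by(group) %>%
  # Within each group, renumber its run ids to 1, 2, 3, ... in order of appearance
  transmute(new_group = as.integer(as.factor(new_group))) %>%
  ungroup()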
#> # A tibble: 12 x 2
#>    group new_group
#>    <fct>     <int>
#>  1 p01           1
#>  2 p01           1
#>  3 p01           1
#>  4 p01           1
#>  5 p02           1
#>  6 p01           2
#>  7 p01           2
#>  8 p01           2
#>  9 p02           2
#> 10 p02           2
#> 11 p01           3
#> 12 p01           3

Created on 2019-08-12 by the reprex package (v0.3.0)
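
For readers who want to see the two steps of the answer separately, here is a minimal base R sketch of the same idea. It is not part of the original answer. The intermediate name run_id and the use of ave() with match() are illustrative choices, and the data frame is rebuilt inline so the snippet runs on its own.

```r
# Rebuild the example data (same values as the dput() output in the question)
df <- data.frame(
  group = factor(c("p01", "p01", "p01", "p01", "p02", "p01",
                   "p01", "p01", "p02", "p02", "p01", "p01"))
)

# Step 1: give every consecutive run of `group` a global id
runs   <- rle(as.integer(df$group))
run_id <- rep(seq_along(runs$lengths), runs$lengths)
# run_id is 1 1 1 1 2 3 3 3 4 4 5 5

# Step 2: within each group, renumber its run ids 1, 2, 3, ...
# in order of first appearance
df$new_group <- ave(run_id, df$group,
                    FUN = function(x) match(x, unique(x)))

print(df)
```

Step 1 corresponds to the rle() call inside mutate(), and step 2 plays the role of the grouped as.integer(as.factor(...)) conversion. Both renumberings agree here because the run ids are already increasing within each group.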

Source: a similar question on Stack Overflow, "r - How to label groups by consecutive pattern in R?": https://stackoverflow.com/questions/57454657/
