


I want to split a character vector column into multiple rows of the same data frame, while keeping the other columns (here, keep), as in this reproducible example:

dat<-structure(list(ID = c("E87", "E42", "E39", "E16,E17,E18", "E760,E761,E762"), keep = 1:5), row.names = c(NA, 5L), class = "data.frame")
> dat
              ID keep
1            E87    1
2            E42    2
3            E39    3
4    E16,E17,E18    4
5 E760,E761,E762    5

Of course I can split ID with strsplit, but the output is a list (which always confuses me for some reason), and the keep column is gone:

strsplit(dat$ID, ",")

[[1]]
[1] "E87"

[[2]]
[1] "E42"

[[3]]
[1] "E39"

[[4]]
[1] "E16" "E17" "E18"

[[5]]
[1] "E760" "E761" "E762"

Using unlist I can flatten this output back into a vector, but then the link between each piece and its original row is lost, so I cannot recombine keep with ID.

unlist(strsplit(dat$ID, ","))

[1] "E87"  "E42"  "E39"  "E16"  "E17"  "E18"  "E760" "E761" "E762"


Any thoughts as to how I might get this output:

> dat
    ID keep
1  E87    1
2  E42    2
3  E39    3
4  E16    4
5  E17    4
6  E18    4
7 E760    5
8 E761    5
9 E762    5


An easier option is separate_rows from the tidyr package:

library(tidyr)
separate_rows(dat, ID)
#    ID keep
#1  E87    1
#2  E42    2
#3  E39    3
#4  E16    4
#5  E17    4
#6  E18    4
#7 E760    5
#8 E761    5
#9 E762    5

Or, following the OP's approach: after splitting ID, name the list elements with the keep column and then stack the result into a two-column data.frame (stack names the columns values and ind):

stack(setNames(strsplit(dat$ID, ","), dat$keep))
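The stack() result does not quite match the requested layout: its columns are called values and ind, and ind is a factor. Below is a minimal base R sketch of one way to tidy that up, assuming the dat from the question. The names parts, stacked, out and out2 are only illustrative, and the trimws() call is a precaution in case the real data has spaces after the commas.

```r
# Same example data as in the question.
dat <- data.frame(
  ID = c("E87", "E42", "E39", "E16,E17,E18", "E760,E761,E762"),
  keep = 1:5,
  stringsAsFactors = FALSE
)

# One character vector per original row.
parts <- strsplit(dat$ID, ",")

# stack() returns columns 'values' and 'ind'; 'ind' is a factor,
# so convert it via character to recover the original integers.
stacked <- stack(setNames(parts, dat$keep))
out <- data.frame(
  ID = trimws(stacked$values),
  keep = as.integer(as.character(stacked$ind)),
  stringsAsFactors = FALSE
)

# Alternative without stack(): repeat each 'keep' value once per
# split piece, so the row correspondence survives unlist().
out2 <- data.frame(
  ID = trimws(unlist(parts)),
  keep = rep(dat$keep, lengths(parts)),
  stringsAsFactors = FALSE
)
```

The second form also addresses the worry about unlist() in the question: unlist() keeps the pieces in row order, and lengths(parts) records how many pieces each row produced, so rep() can line keep up with them again.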

