Using strsplit and subset in dplyr and mutate


Problem description

I have a data table with one string column. I'd like to use strsplit to create another column that holds a substring of each value in that column.

dat <- data.table(labels=c('a_1','b_2','c_3','d_4'))

My desired output is

label  sub_label
a_1    a
b_2    b
c_3    c
d_4    d

I've tried the following, but neither attempt seems to work.

dat %>%
    mutate(
        sub_labels=strsplit(as.character(labels), "_")[[1]][1]
    )
# gives a column whose values are all "a"
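
The all-"a" result can be reproduced outside of mutate. strsplit returns a list with one element per input string, so `[[1]][1]` picks the first piece of the first string only, and mutate then recycles that single value down the whole column. A minimal base R sketch (the names `pieces`, `first_only` and `first_each` are illustrative, not from the original post):

```r
labels <- c("a_1", "b_2", "c_3", "d_4")

# strsplit returns a list with one character vector per input string
pieces <- strsplit(labels, "_")

# [[1]] selects the split of the FIRST string only; [1] takes its first piece
first_only <- pieces[[1]][1]

# Taking the first piece of every list element gives one value per row
first_each <- vapply(pieces, function(x) x[1], character(1))
```

Here `first_only` is a single string, while `first_each` has the same length as `labels`, which is what a new column needs.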

This one, which seems logical to me,

dat %>%
    mutate(
        sub_labels=sapply(strsplit(as.character(labels), "_"), function(x) x[[1]][1])
    )

gives an error.

I saw another post where paste with collapse worked on the output of strsplit, so I don't understand why subsetting inside an anonymous function causes problems. Thanks for any explanation.

Recommended answer

tidyr::separate can help here:

> dat %>% separate(labels, c("first", "second") )
   first second
1:     a      1
2:     b      2
3:     c      3
4:     d      4
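
Note that separate drops the original column by default (its `remove` argument defaults to TRUE), whereas the desired output in the question keeps it. One option is to pass `remove = FALSE`. Another is to stay inside mutate with a vectorized string function. The sketch below takes the second route and assumes a plain data.frame instead of a data.table so that it only needs dplyr; the regular expression is one possible choice, not the answerer's method:

```r
library(dplyr)

dat <- data.frame(
    labels = c("a_1", "b_2", "c_3", "d_4"),
    stringsAsFactors = FALSE
)

result <- dat %>%
    mutate(
        # remove the first underscore and everything after it
        sub_labels = sub("_.*$", "", labels)
    )
```

Because sub is vectorized over `labels`, it returns one value per row and needs no sapply or list indexing.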
