Problem description
POSIX regular expressions are giving me a headache.
Let's say we have a string:
a = "[question(37), question_pipe(\"Person10\")]"
In the end I would like to get:
b = c("37", "Person10")
I've had a look at the stringr package but can't figure out how to extract the information using regular expressions and str_split.
Any help would be greatly appreciated.
Cameron
Recommended answer
So if I understand correctly, you want to extract the elements within the parentheses.
You can first extract those elements, including the parentheses, using str_extract_all:
library(stringr)
b1 <- str_extract_all(string = a, pattern = "\\(.*?\\)")
b1
# [[1]]
# [1] "(37)" "(\"Person10\")"
Since str_extract_all returns a list (one element per input string), let's turn it into a vector:
b2 <- unlist(b1)
b2
# [1] "(37)" "(\"Person10\")"
Finally, you can remove the parentheses (the first and last character of each string) using str_sub:
b3 <- str_sub(string = b2, start = 2L, end = -2L)
b3
# [1] "37" "\"Person10\""
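Note that the second element still carries the embedded double quotes, whereas the target in the question was c("37", "Person10"). One possible way to get there, as a minimal sketch that assumes stringr is installed and that the quotes never need to be preserved, is to remove them with str_remove_all. This extra step is an addition for illustration, not part of the original answer:

```r
library(stringr)

a <- "[question(37), question_pipe(\"Person10\")]"

# Same three steps as above: extract "(...)", flatten the list, trim the parentheses.
b3 <- str_sub(unlist(str_extract_all(a, "\\(.*?\\)")), start = 2L, end = -2L)

# Extra step (assumption: embedded quotes are unwanted): drop every double quote.
b <- str_remove_all(b3, "\"")
b
# [1] "37"       "Person10"
```

If a value could legitimately contain a double quote, trimming only a leading and trailing quote would be the safer choice.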
A few comments about the regex pattern: \\( and \\) are your opening and closing parentheses, escaped because a bare parenthesis starts or ends a group in a regex. .*? means any sequence of characters, matched non-greedily (as few characters as possible); with a greedy .* you would get one long match from the first ( to the last ).
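To see the difference between the lazy and greedy quantifiers on this input, the two patterns can be run side by side. This is a small illustrative sketch, assuming stringr is installed; the variable names lazy and greedy are only for the example:

```r
library(stringr)

a <- "[question(37), question_pipe(\"Person10\")]"

# Lazy: each match stops at the nearest closing parenthesis.
lazy <- str_extract_all(a, "\\(.*?\\)")[[1]]

# Greedy: a single match runs from the first "(" to the last ")".
greedy <- str_extract_all(a, "\\(.*\\)")[[1]]

length(lazy)
# [1] 2
length(greedy)
# [1] 1
greedy
# [1] "(37), question_pipe(\"Person10\")"
```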