Problem description
I'm trying to dig deeper into regexes and want to match a condition unless some other substring is also found in the same string. I know I can use two grepl statements (as shown below), but I want a single regex for this condition as a way of pushing my understanding. Let's say I want to match the words "dog" and "man" using "(dog.*man|man.*dog)" (taken from here), but not if the string contains the substring "park". I figured I could use (*SKIP)(*FAIL) to negate "park", but this does not cause the string to fail (shown below).
- How can I match the logic of finding "dog" & "man" but not "park" with 1 regex?
- What is wrong with my understanding of (*SKIP)(*FAIL)| ?
Code:
x <- c(
"The dog and the man play in the park.",
"The man plays with the dog.",
"That is the man's hat.",
"Man I love that dog!",
"I'm dog tired",
"The dog park is no place for man.",
"Park next to this dog's man."
)
# Could do this but want one regex
grepl("(dog.*man|man.*dog)", x, ignore.case=TRUE) & !grepl("park", x, ignore.case=TRUE)
# Thought this would work, it does not
grepl("park(*SKIP)(*FAIL)|(dog.*man|man.*dog)", x, ignore.case=TRUE, perl=TRUE)
Recommended answer
You can use the anchored look-ahead solution (requiring Perl-style regexp):
grepl("^(?!.*park)(?=.*dog.*man|.*man.*dog)", x, ignore.case=TRUE, perl=TRUE)
Here is an IDEONE demo.
- ^ anchors the pattern at the start of the string
- (?!.*park) fails the match if park is present
- (?=.*dog.*man|.*man.*dog) fails the match if man or dog is absent.
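To check the pattern against the sample vector from the question, the following R sketch compares it with the two-grepl baseline and with the (*SKIP)(*FAIL) attempt. The variable names (baseline, lookahead, skipfail) are illustrative and do not come from the original posts.

```r
# Sample strings from the question
x <- c(
  "The dog and the man play in the park.",
  "The man plays with the dog.",
  "That is the man's hat.",
  "Man I love that dog!",
  "I'm dog tired",
  "The dog park is no place for man.",
  "Park next to this dog's man."
)

# Baseline: two separate grepl calls combined with logical AND / NOT
baseline <- grepl("(dog.*man|man.*dog)", x, ignore.case = TRUE) &
  !grepl("park", x, ignore.case = TRUE)

# Single regex: anchored negative look-ahead plus positive look-ahead
lookahead <- grepl("^(?!.*park)(?=.*dog.*man|.*man.*dog)", x,
                   ignore.case = TRUE, perl = TRUE)

# The attempt from the question, kept for comparison
skipfail <- grepl("park(*SKIP)(*FAIL)|(dog.*man|man.*dog)", x,
                  ignore.case = TRUE, perl = TRUE)

# Does the single regex agree with the baseline on every string?
print(identical(lookahead, baseline))

# Side-by-side view of the three approaches
print(data.frame(x, baseline, lookahead, skipfail))
```

The comparison also suggests why the (*SKIP)(*FAIL) attempt does not reject a string: the verbs only discard the text matched as park and let the engine resume scanning after it, so dog ... man can still match elsewhere in the string, or from a start position before park is ever reached. The look-ahead anchored at ^ inspects the whole string before any match is allowed.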
Another version (more scalable) with 3 look-aheads:
^(?!.*park)(?=.*dog)(?=.*man)
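One way to read "more scalable" is that each required or forbidden word gets its own look-ahead, so the pattern can be assembled from word lists. The sketch below illustrates that idea; build_pattern is a hypothetical helper, not part of the original answer, and it assumes the words contain no regex metacharacters (they are pasted in without escaping).

```r
# Hypothetical helper: one negative look-ahead per forbidden word,
# one positive look-ahead per required word, anchored at the start.
# Assumption: words contain no regex metacharacters (no escaping is done).
build_pattern <- function(required, forbidden = character(0)) {
  neg <- if (length(forbidden) > 0) {
    paste0("(?!.*", forbidden, ")", collapse = "")
  } else {
    ""
  }
  pos <- if (length(required) > 0) {
    paste0("(?=.*", required, ")", collapse = "")
  } else {
    ""
  }
  paste0("^", neg, pos)
}

# Sample strings from the question
x <- c(
  "The dog and the man play in the park.",
  "The man plays with the dog.",
  "That is the man's hat.",
  "Man I love that dog!",
  "I'm dog tired",
  "The dog park is no place for man.",
  "Park next to this dog's man."
)

pattern <- build_pattern(required = c("dog", "man"), forbidden = "park")
print(pattern)

# Look-aheads require perl = TRUE
result <- grepl(pattern, x, ignore.case = TRUE, perl = TRUE)
print(result)
```

Adding another required or forbidden word then only means extending the corresponding vector, rather than enumerating every ordering of the words inside a single alternation.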