Greedy vs. Reluctant vs. Possessive Qualifiers

Problem description

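I found this tutorial on regular expressions, and while I intuitively understand what "greedy", "reluctant" and "possessive" qualifiers do, there seems to be a serious hole in my understanding.

Specifically, in the following example: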

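Enter your regex: .*foo // Greedy qualifier
Enter input string to search: xfooxxxxxxfoo
I found the text "xfooxxxxxxfoo" starting at index 0 and ending at index 13.

Enter your regex: .*?foo // Reluctant qualifier
Enter input string to search: xfooxxxxxxfoo
I found the text "xfoo" starting at index 0 and ending at index 4.
I found the text "xxxxxxfoo" starting at index 4 and ending at index 13.

Enter your regex: .*+foo // Possessive qualifier
Enter input string to search: xfooxxxxxxfoo
No match found.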
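For anyone who wants to reproduce the output above, here is a rough stand-in for the tutorial's test harness. The class name `QuantifierHarness` and the helper `findAll` are made up for illustration; only `Pattern` and `Matcher` come from `java.util.regex`. It is a sketch, not the tutorial's actual harness.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

class QuantifierHarness {
    // Hypothetical helper: collects every match as
    // "text" [start,end) with end exclusive, like the harness output.
    static List<String> findAll(String regex, String input) {
        List<String> found = new ArrayList<>();
        Matcher m = Pattern.compile(regex).matcher(input);
        while (m.find()) {
            found.add(String.format("\"%s\" [%d,%d)", m.group(), m.start(), m.end()));
        }
        return found;
    }

    public static void main(String[] args) {
        String input = "xfooxxxxxxfoo";
        // Greedy, reluctant and possessive versions of the same pattern.
        for (String regex : new String[] {".*foo", ".*?foo", ".*+foo"}) {
            List<String> found = findAll(regex, input);
            System.out.println(regex + " -> " + (found.isEmpty() ? "No match found." : found));
        }
    }
}
```

Running `main` should report one match for the greedy pattern, two for the reluctant one, and none for the possessive one, mirroring the three cases in the question.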

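The explanation mentions eating the entire input string, letters being consumed, the matcher backing off, the rightmost occurrence of "foo" being regurgitated, etc.

Unfortunately, despite the nice metaphors, I still don't understand what is eaten by whom... Do you know of another tutorial that explains (concisely) how regular expression engines work?

Alternatively, if someone can explain the following paragraphs in somewhat different phrasing, that would be much appreciated: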

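The first example uses the greedy quantifier .* to find "anything", zero or more times, followed by the letters "f" "o" "o". Because the quantifier is greedy, the .* portion of the expression first eats the entire input string. At this point, the overall expression cannot succeed, because the last three letters ("f" "o" "o") have already been consumed [by whom?]. So the matcher slowly backs off [from right to left?] one letter at a time until the rightmost occurrence of "foo" has been regurgitated [what does this mean?], at which point the match succeeds and the search ends.

The second example, however, is reluctant, so it starts by first consuming [by whom?] "nothing". Because "foo" doesn't appear at the beginning of the string, it's forced to swallow [who swallows?] the first letter (an "x"), which triggers the first match at 0 and 4. Our test harness continues the process until the input string is exhausted. It finds another match at 4 and 13.

The third example fails to find a match because the quantifier is possessive. In this case, the entire input string is consumed by .*+ [how?], leaving nothing left over to satisfy the "foo" at the end of the expression. Use a possessive quantifier for situations where you want to seize all of something without ever backing off [what does back off mean?]; it will outperform the equivalent greedy quantifier in cases where the match is not immediately found.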

Solution

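I'll give it a shot.

A greedy quantifier first matches as much as possible. So the .* matches the entire string. Then the matcher tries to match the f that follows, but there are no characters left. So it "backtracks", making the greedy quantifier match one fewer character (leaving the "o" at the end of the string unmatched). That still doesn't match the f in the regex, so it backtracks one more step, making the greedy quantifier match one fewer character again (leaving the "oo" at the end of the string unmatched). That still doesn't match the f in the regex, so it backtracks one more step (leaving the "foo" at the end of the string unmatched). Now the matcher finally matches the f in the regex, and the o and the next o are matched too. Success!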
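If it helps, the greedy walk-through can be sketched by hand, without the regex engine. This is only a simulation under simplifying assumptions: the input has no line terminators (so . matches every character), the match is attempted from index 0 only, and `startsWith` stands in for matching f, o, o one at a time. `GreedySketch` and `greedyKeep` are made-up names.

```java
class GreedySketch {
    // Simulates ".*" followed by a literal, anchored at index 0.
    // Returns how many characters ".*" still holds when the literal
    // finally matches, or -1 if no split works.
    static int greedyKeep(String input, String literal) {
        // Greedy: take everything first, then give back one character per step.
        for (int keep = input.length(); keep >= 0; keep--) {
            if (input.startsWith(literal, keep)) {
                return keep; // the literal fits in what was given back
            }
        }
        return -1; // every possible split was tried without success
    }

    public static void main(String[] args) {
        String input = "xfooxxxxxxfoo";
        int keep = greedyKeep(input, "foo");
        System.out.println(".* keeps \"" + input.substring(0, keep) + "\" after giving back "
                + (input.length() - keep) + " characters");
    }
}
```

For "xfooxxxxxxfoo" the loop fails at 13, 12 and 11 kept characters and succeeds at 10, which corresponds to the three backtracking steps described above.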

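A reluctant or "non-greedy" quantifier first matches as little as possible. So the .*? matches nothing at first, leaving the entire string unmatched. Then the matcher tries to match the f that follows, but the unmatched portion of the string starts with "x", so that doesn't work. So the matcher backtracks, making the non-greedy quantifier match one more character (now it matches the "x", leaving "fooxxxxxxfoo" unmatched). Then it tries to match the f, which succeeds, and the o and the next o in the regex match too. Success!

In your example, the test harness then starts over with the remaining unmatched portion of the string, "xxxxxxfoo", following the same process.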
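The reluctant case, including the restart on the leftover input, can be simulated the same way. Again this is a hand-rolled sketch rather than what the engine literally does: it assumes a non-empty literal and an input without line terminators, and `ReluctantSketch` and `reluctantMatches` are names invented for this example.

```java
import java.util.ArrayList;
import java.util.List;

class ReluctantSketch {
    // Simulates repeated searches for ".*?" followed by a non-empty literal.
    // Each int[] is {start, end} with end exclusive.
    static List<int[]> reluctantMatches(String input, String literal) {
        List<int[]> spans = new ArrayList<>();
        int start = 0;
        while (start <= input.length()) {
            int end = -1;
            // Reluctant: ".*?" takes nothing first, then one more character per retry.
            for (int taken = 0; start + taken <= input.length(); taken++) {
                if (input.startsWith(literal, start + taken)) {
                    end = start + taken + literal.length();
                    break;
                }
            }
            // With "." able to match every character, failing here means
            // no later start position can succeed either.
            if (end < 0) break;
            spans.add(new int[] {start, end});
            start = end; // the next search resumes where this match ended
        }
        return spans;
    }

    public static void main(String[] args) {
        for (int[] span : reluctantMatches("xfooxxxxxxfoo", "foo")) {
            System.out.println("match from " + span[0] + " to " + span[1]);
        }
    }
}
```

The first pass stops after .*? has taken a single "x" (indices 0 to 4); the second pass starts at index 4 and has to take six characters before "foo" fits (indices 4 to 13).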

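A possessive quantifier is just like the greedy quantifier, but it doesn't backtrack. So it starts out with .*+ matching the entire string, leaving nothing unmatched. Then there is nothing left for it to match with the f in the regex. Since the possessive quantifier doesn't backtrack, the match fails there.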
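To see that the failure comes from the missing backtracking rather than from possessive quantifiers being useless, here is a small comparison. The atomic group `(?>.*)` is included because, as far as I know, `java.util.regex` treats it the same way as `.*+`; the `x*+foo` pattern is my own extra example, not from the tutorial, and `PossessiveSketch` and `finds` are made-up names.

```java
import java.util.regex.Pattern;

class PossessiveSketch {
    // Hypothetical helper: true if the regex matches anywhere in the input.
    static boolean finds(String regex, String input) {
        return Pattern.compile(regex).matcher(input).find();
    }

    public static void main(String[] args) {
        String input = "xfooxxxxxxfoo";
        // Greedy: backtracks until "foo" fits.
        System.out.println(".*foo     -> " + finds(".*foo", input));
        // Possessive: keeps everything it grabbed, so "foo" never fits.
        System.out.println(".*+foo    -> " + finds(".*+foo", input));
        // Atomic group: also refuses to give characters back.
        System.out.println("(?>.*)foo -> " + finds("(?>.*)foo", input));
        // Possessive, but x*+ cannot swallow the "f", so nothing needs to be given back.
        System.out.println("x*+foo    -> " + finds("x*+foo", input));
    }
}
```

So a possessive quantifier still matches fine when what it grabs never overlaps with what the rest of the pattern needs; it only fails where a greedy quantifier would have had to give something back.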


08-18 10:00