regex - Why does my non-greedy Perl regex still match too much?

Say I have a line that contains the following string:

"$tom" said blah blah blash.  "$dick" said "blah blah blah". "$harry" said blah blah blah.

and I want to extract

"$dick" said "blah blah blah"

I have the following code:

my ($term) = /(".+?" said ".+?")/g;
print $term;

But this gives me more than I need:

"$tom" said blah blah blash.  "$dick" said "blah blah blah"

I tried grouping my pattern as a whole by using the non-capturing parens:

my ($term) = /((?:".+?" said ".+?"))/g;

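But the problem persists.

I have reread the "Non-greedy quantifiers" section of Learning Perl, but so far it has gotten me nowhere.

Thanks for any guidance you can offer :)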

Best answer

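The problem is that, even though the quantifier is non-greedy, the regex still keeps trying. A non-greedy quantifier only prefers the shortest match from a given starting point, and the engine takes the leftmost starting point that can be made to work. The regex doesn't see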

"$tom" said blah blah blash.

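and think, "Oh, the stuff following "said" isn't quoted, so I'll skip that one." It thinks, "Well, the stuff after "said" isn't quoted, so it must still be part of our quote." So ".+?" matches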

"$tom" said blah blah blash.  "$dick"
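The behavior described above can be reproduced with a small script. This is only an illustrative sketch: the variable names $line and $first_field are made up for the example, and the pattern is a shortened form of the one in the question so that the first quoted field is captured on its own.

```perl
use strict;
use warnings;

# Single quotes keep $tom, $dick and $harry as literal text.
my $line = '"$tom" said blah blah blash.  "$dick" said "blah blah blah". "$harry" said blah blah blah.';

# The match starts at the first quote in the line. The non-greedy .+?
# then grows one character at a time, crossing other quotes if it has to,
# until the rest of the pattern ( said ") can match.
my ($first_field) = $line =~ /(".+?") said "/;

print "$first_field\n";   # prints: "$tom" said blah blah blash.  "$dick"
```

The first field stretches all the way to "$dick" because that is the first place where a closing quote is followed by said and another quote.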

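What you want is "[^"]+". This matches a pair of quotation marks enclosing one or more characters that are not quotation marks. So the final solution is: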

("[^"]+" said "[^"]+")
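As a quick check, here is the negated character class applied to the line from the question. It is a minimal sketch that assumes the line is held in a variable named $line (a name chosen for this example) rather than in $_ as in the original code.

```perl
use strict;
use warnings;

# Single quotes keep $tom, $dick and $harry as literal text.
my $line = '"$tom" said blah blah blash.  "$dick" said "blah blah blah". "$harry" said blah blah blah.';

# [^"]+ cannot cross a quotation mark, so each quoted field must open
# and close without another quote in between.
my ($term) = $line =~ /("[^"]+" said "[^"]+")/;

print "$term\n";   # prints: "$dick" said "blah blah blah"
```

Attempts that start at the quotes around $tom fail because no quoted field follows said there, so the engine moves on until it reaches "$dick".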

For more on "regex - Why does my non-greedy Perl regex still match too much?", see the similar question on Stack Overflow: https://stackoverflow.com/questions/1598946/