Problem description
I would like to grab the contents of any value between pairs of <tag></tag> tags.
<tag>
This is one block of text
</tag>
<tag>
This is another one
</tag>
The regex I came up with is
/<tag>(.*)<\/tag>/m
However, it appears to be greedy and captures everything inside the parentheses up to the very last </tag>. I would like it to be as lazy as possible, so that every time it sees a closing tag it treats that as a match group and starts over.
How can I write the regex so that I get multiple matches in the given scenario?
I have included a sample of what I am describing in the following link
Note: This is not XML, nor is it really based on any existing standard format. I don't need anything sophisticated like a full-fledged library that comes with a nice parser.
Recommended answer
Use this regex pattern:
/<tag>(.*?)<\/tag>/im
The lazy (non-greedy) quantifier is .*?, not .*.
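To make the difference concrete, here is a minimal Ruby sketch that runs both quantifiers against the sample from the question. The variable names (`text`, `greedy`, `lazy`) are illustrative only. In Ruby, the `m` flag lets `.` match newlines, which is why each pattern can span several lines.

```ruby
# Sample input from the question, with explicit newlines.
text = "<tag>\nThis is one block of text\n</tag>\n<tag>\nThis is another one\n</tag>\n"

# Greedy: .* grabs as much as possible, so the capture runs to the LAST </tag>.
greedy = text[/<tag>(.*)<\/tag>/m, 1]

# Lazy: .*? grabs as little as possible, so the capture stops at the FIRST </tag>.
lazy = text[/<tag>(.*?)<\/tag>/m, 1]

puts greedy.inspect
puts lazy.inspect
```

The greedy capture still contains the inner `</tag>` and the second `<tag>`, while the lazy capture holds only the first block.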
To find multiple occurrences, use:
string.scan(/<tag>(.*?)<\/tag>/im)
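As a rough illustration of the `scan` call above, the sketch below applies it to the sample from the question. Because the pattern has one capture group, `scan` returns an array of one-element arrays. Flattening it and stripping the surrounding newlines is an optional extra step not in the original answer, and the names `text` and `blocks` are made up for this example.

```ruby
# Sample input from the question (squiggly heredoc strips the indentation).
text = <<~TEXT
  <tag>
  This is one block of text
  </tag>
  <tag>
  This is another one
  </tag>
TEXT

# scan returns one sub-array per match, holding the capture group(s).
raw = text.scan(/<tag>(.*?)<\/tag>/im)

# Optional cleanup: flatten to a plain array and trim the newlines.
blocks = raw.flatten.map(&:strip)

puts blocks.inspect
```

The `i` flag makes the match case-insensitive, so `<TAG>` would also match. Drop it if tag names should be matched exactly.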