Problem description
I'm testing the following on regex101.com:
Regex: ^\+([0-9A-Za-z-]+)(?:\.([0-9A-Za-z-]+))*$
Test string: +beta-bar.baz-bz.fd.zz
The string matches, but the "match information" box shows only two capture groups:
Match 1
1. [1-9] `beta-bar`
2. [20-22] `zz`
I was expecting all of these captures:
- beta-bar
- baz-bz
- fd
- zz
Why wasn't each identifier between the periods recognized as its own capture group?
Recommended answer
This happens because when a quantifier is applied to a capture group and the group matches n times, only the text captured in the last iteration is kept in the group's buffer and returned at the end.
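The behavior described above can be reproduced directly in PHP. The following is a minimal sketch, assuming the pattern and test string from the question; the variable names ($pattern, $subject, $m) are chosen here for illustration only.

```php
<?php
// Sketch only: pattern and subject are copied from the question.
$pattern = '/^\+([0-9A-Za-z-]+)(?:\.([0-9A-Za-z-]+))*$/';
$subject = '+beta-bar.baz-bz.fd.zz';

// preg_match() fills $m with the whole match plus one slot per capture group.
$matched = preg_match($pattern, $subject, $m);

// Group 2 sits inside a repeated non-capturing group, so every iteration
// overwrites its slot; only the text of the final iteration survives.
print_r($m);
```

Here $m ends up with three elements: the full match, `beta-bar` for group 1, and `zz` for group 2. The intermediate values `baz-bz` and `fd` were matched but are no longer available once the match finishes.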
Instead of matching those parts, you can preg_split the string with a simple regex, [+.]:
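$str = "+beta-bar.baz-bz.fd.zz";
// Split on "+" or "."; PREG_SPLIT_NO_EMPTY drops the empty piece before the leading "+".
$a = preg_split('/[+.]/', $str, -1, PREG_SPLIT_NO_EMPTY);
print_r($a);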
See the IDEONE demo.
Result:
Array
(
    [0] => beta-bar
    [1] => baz-bz
    [2] => fd
    [3] => zz
)
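Note that splitting alone does not check that the input has the shape the original pattern enforces. If that check matters, one possible approach is to validate first and split afterwards. The sketch below is not part of the original answer: splitIdentifiers is a hypothetical helper name, and the D modifier is an added assumption so that `$` cannot match before a trailing newline.

```php
<?php
// Hypothetical helper (not from the original answer): validate, then split.
function splitIdentifiers(string $str): ?array
{
    // Same pattern as in the question, used here only as a format check.
    // The D modifier (added here) makes "$" match only at the very end.
    $pattern = '/^\+([0-9A-Za-z-]+)(?:\.([0-9A-Za-z-]+))*$/D';
    if (preg_match($pattern, $str) !== 1) {
        return null; // input does not have the expected shape
    }
    // Drop the leading "+" and split on the literal dots.
    return explode('.', substr($str, 1));
}

$parts = splitIdentifiers('+beta-bar.baz-bz.fd.zz'); // four identifiers
$bad   = splitIdentifiers('+beta..zz');              // null: empty identifier
```

Because the pattern has already guaranteed that every piece is non-empty, a plain explode on "." is enough here and no PREG_SPLIT_NO_EMPTY filtering is needed.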