Why isn't this regex capture group repeated for each match?

Problem description

I'm testing the following on regex101.com:

Regex: ^\+([0-9A-Za-z-]+)(?:\.([0-9A-Za-z-]+))*$

Test string: +beta-bar.baz-bz.fd.zz

The string matches, but the "match information" box shows only two captured groups:

Match 1
1. [1-9] `beta-bar`
2. [20-22] `zz`

I was expecting all of these captures:

  1. beta-bar
  2. baz-bz
  3. fd
  4. zz

Why wasn't each identifier between the periods recognized as its own captured group?

Recommended answer

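This happens because a capture group has only one slot for its text. When a quantifier is applied to a capture group (or to a group that contains it) and the group matches n times, each iteration overwrites the previous one, so only the last captured text is stored in the buffer and returned at the end. The number of groups is fixed by the pattern, not by how many times a group repeats.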
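To see the overwriting behavior directly, here is a minimal sketch that runs the pattern from the question through PHP's preg_match. The variable names are illustrative only; the pattern and test string are taken from the question.

```php
<?php
// Pattern from the question: group 2 sits inside a repeated (?:...)* group.
$pattern = '/^\+([0-9A-Za-z-]+)(?:\.([0-9A-Za-z-]+))*$/';
$str = '+beta-bar.baz-bz.fd.zz';

preg_match($pattern, $str, $m);

// $m[0] is the whole match, $m[1] is the first identifier, and $m[2]
// holds only the text from the final iteration of the repeated group.
print_r($m);
```

The array has three entries: the full match, `beta-bar`, and `zz`. The intermediate values `baz-bz` and `fd` were matched along the way, but each one was replaced by the next iteration.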

Instead of trying to capture those parts, you can split the string with preg_split and the simple regex [+.]:

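$str = "+beta-bar.baz-bz.fd.zz";
// Split on "+" or "."; PREG_SPLIT_NO_EMPTY drops the empty piece before the leading "+".
$a = preg_split('/[+.]/', $str, -1, PREG_SPLIT_NO_EMPTY);
print_r($a);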

See the IDEONE demo.

Result:

Array
(
    [0] => beta-bar
    [1] => baz-bz
    [2] => fd
    [3] => zz
)
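One thing to keep in mind is that preg_split by itself does not check that the input has the expected shape. If that matters, a possible approach is to keep the original pattern as a validity check and split only when it matches. The sketch below assumes PHP 7.1 or later (for the nullable return type), and splitIdentifiers is a hypothetical helper name, not part of the original answer.

```php
<?php
// Hypothetical helper: validate the format first, then split.
function splitIdentifiers(string $str): ?array
{
    // Same pattern as in the question, used here only as a format check.
    $pattern = '/^\+([0-9A-Za-z-]+)(?:\.([0-9A-Za-z-]+))*$/';
    if (preg_match($pattern, $str) !== 1) {
        return null; // input does not have the expected shape
    }
    // The format is known to be valid, so splitting on "+" and "." is safe.
    return preg_split('/[+.]/', $str, -1, PREG_SPLIT_NO_EMPTY);
}

$ids = splitIdentifiers('+beta-bar.baz-bz.fd.zz'); // the four identifiers
$bad = splitIdentifiers('beta-bar..baz');          // null: no leading "+"
```

Returning null for malformed input is just one design choice; throwing an exception or returning an empty array would work equally well, depending on how the caller wants to handle errors.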

