Problem description
My task is to find all runs of consecutive digits in a string that contains only digits. I am not looking for a better way to do this, however, but for a correct regex that matches the substrings.
This is how I build my regex:
$regex = "";
for ($i = 0; $i < 10; $i++) {
    $str = "";
    for ($a = 0; $a < 10; $a++) {
        if ($a > $i) {
            $str .= $a;
            if (strlen($str) > 1) {
                $regex .= "|" . $str;
            }
        }
    }
}
$myregex = "/" . ltrim($regex, "|") . "/";
echo $myregex;
Result:
/12|123|1234|12345|123456|1234567|12345678|123456789|23|234|2345|23456|234567|2345678|23456789|34|345|3456|34567|345678|3456789|45|456|4567|45678|456789|56|567|5678|56789|67|678|6789|78|789|89/
Then I run:
$literal = '234121678941251236544567812122345678';
$matches = [];
preg_match_all($myregex,$literal,$matches);
var_dump($matches);
Result:
array(1) {
[0]=>
array(13) {
[0]=>
string(2) "23"
[1]=>
string(2) "12"
[2]=>
string(2) "67"
[3]=>
string(2) "89"
[4]=>
string(2) "12"
[5]=>
string(2) "12"
[6]=>
string(2) "45"
[7]=>
string(2) "67"
[8]=>
string(2) "12"
[9]=>
string(2) "12"
[10]=>
string(2) "23"
[11]=>
string(2) "45"
[12]=>
string(2) "67"
}
}
However, I want to find every substring that occurs (rather than moving on to the next characters after a match), like:
23,234,34,12,67,678,6789,78,789,89,12, ...
I have tried different variations with brackets, +, and so on, but could not figure out the correct regex to find all matches (sorry, I'm still a bit of a regex noob). How do I have to modify the regular expression?
Recommended answer
The order of the alternatives in the regex is important. I'm not sure whether this fully solves the issue, and the method of doing it this way may be fundamentally flawed, but you can try this:
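// Collect every ascending digit sequence of length >= 2 in an array.
$regex = [];
for ($i = 0; $i < 10; $i++) {
    $str = "";
    for ($a = 0; $a < 10; $a++) {
        if ($a > $i) {
            $str .= $a;
            if (strlen($str) > 1) {
                $regex[] = $str;
            }
        }
    }
}
// Sort by length, longest first, so longer alternatives are tried first.
usort($regex, function ($a, $b) {
    return strlen($b) <=> strlen($a);
});
$myregex = '/' . implode('|', $regex) . '/';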
What it does is collect the number sequences in an array, then sort them by length so that the longest sequences come first. The end result (after matching) is this:
array(1) {
[0]=>
array(9) {
[0]=>
string(3) "234"
[1]=>
string(2) "12"
[2]=>
string(4) "6789"
[3]=>
string(2) "12"
[4]=>
string(3) "123"
[5]=>
string(5) "45678"
[6]=>
string(2) "12"
[7]=>
string(2) "12"
[8]=>
string(7) "2345678"
}
}
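Why the order matters can be seen on a much smaller pattern. The sketch below is only an illustration with made-up patterns and input (they are not taken from the post): PCRE tries alternatives from left to right and stops at the first one that matches, so a shorter alternative listed first hides a longer one.

```php
<?php
// Illustrative patterns and subject only; not taken from the original post.
$subject = '234';

// Shorter alternative listed first: the engine stops at "23".
preg_match_all('/23|234/', $subject, $shortFirst);

// Longer alternative listed first: the engine finds "234".
preg_match_all('/234|23/', $subject, $longFirst);

echo implode(',', $shortFirst[0]), PHP_EOL; // 23
echo implode(',', $longFirst[0]), PHP_EOL;  // 234
```

In both cases each position in the subject yields at most one match, which is why sorting helps with length but does not return overlapping substrings.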
Also note that the spaceship operator <=> only works in PHP 7+.
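On an older PHP version, the same longest-first ordering can probably be had without the spaceship operator by subtracting the lengths. This is a small sketch with sample data invented for illustration, not part of the original answer:

```php
<?php
// Sample data for illustration only.
$sequences = ['12', '1234', '123'];

// strlen($b) - strlen($a) is positive when $b is longer,
// so longer strings are sorted to the front.
usort($sequences, function ($a, $b) {
    return strlen($b) - strlen($a);
});

echo implode('|', $sequences), PHP_EOL; // 1234|123|12
```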
Hope that helps.
Regarding "and not go to the next chars after a match":
I don't think this is possible with regex, if you mean you want to find 23, 234, and 2345 all at once in 2345607, for example. However, if it matches a long sequence, it stands to reason that it must also match the shorter sequences that the long one starts with. So you could just trim digits off the right-hand end until the length is 2 and collect the matches.
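The trimming idea could be sketched as follows. This is a minimal sketch, not the answerer's code. The function name findConsecutiveSubstrings is hypothetical. It also goes one step beyond trimming from the right by dropping leading digits too, on the assumption that substrings such as 34 inside 234 are wanted as well (the example list in the question suggests so).

```php
<?php
// Hypothetical helper, not part of the original answer.
function findConsecutiveSubstrings(string $literal): array
{
    // Build every ascending run of length >= 2 from the digits 1..9.
    $digits = '123456789';
    $alternatives = [];
    for ($start = 0; $start < strlen($digits) - 1; $start++) {
        for ($size = 2; $start + $size <= strlen($digits); $size++) {
            $alternatives[] = substr($digits, $start, $size);
        }
    }

    // Longest first, so each match is the longest run at its position.
    usort($alternatives, function ($a, $b) {
        return strlen($b) <=> strlen($a);
    });
    preg_match_all('/' . implode('|', $alternatives) . '/', $literal, $matches);

    // Expand each matched run into all of its substrings of length >= 2.
    $found = [];
    foreach ($matches[0] as $run) {
        $len = strlen($run);
        for ($offset = 0; $offset <= $len - 2; $offset++) {
            for ($size = 2; $offset + $size <= $len; $size++) {
                $found[] = substr($run, $offset, $size);
            }
        }
    }
    return $found;
}

echo implode(',', findConsecutiveSubstrings('2341216789')), PHP_EOL;
// 23,234,34,12,67,678,6789,78,789,89
```

If only the right-trimmed prefixes are wanted, as the answer describes, the outer $offset loop can be dropped so that every substring starts at the beginning of the run.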