Problem description
I'm trying to use [[:>:]] in my regex, but it is not accepted, while other character classes, e.g. [[:digit:]] or [[:word:]], are. What's going wrong?
Recommended answer
It is a bug, because these constructs (the starting word boundary, [[:<:]], and the ending word boundary, [[:>:]]) are supported by the PCRE library itself:
COMPATIBILITY FEATURE FOR WORD BOUNDARIES
In the POSIX.2 compliant library that was included in 4.4BSD Unix, the
ugly syntax [[:<:]] and [[:>:]] is used for matching "start of word"
and "end of word". PCRE treats these items as follows:
[[:<:]] is converted to \b(?=\w)
[[:>:]] is converted to \b(?<=\w)
Only these exact character sequences are recognized. A sequence such as
[a[:<:]b] provokes error for an unrecognized POSIX class name. This
support is not compatible with Perl. It is provided to help migrations
from other environments, and is best not used in any new patterns. Note
that \b matches at the start and the end of a word (see "Simple assertions"
above), and in a Perl-style pattern the preceding or following character
normally shows which is wanted, without the need for the assertions that
are used above in order to give exactly the POSIX behaviour.
When used in PHP code, it works:
if (preg_match_all('/[[:<:]]home[[:>:]]/', 'homeless and home', $m))
{
print_r($m[0]);
}
This finds Array ( [0] => home). See the online PHP demo.
So it is the regex101.com developer team that decided not to include (or forgot to include) support for these paired word boundaries.
At regex101.com, use \b word boundaries instead, for both the start and the end of a word. They are supported by all four regex101.com regex engines: PCRE, JS, Python and Go.
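As a rough illustration of the \b approach in one of those engines, here is a minimal Python sketch that reuses the test string from the PHP example. The helper name find_whole_word is hypothetical and exists only for this example.

```python
import re

def find_whole_word(word, text):
    """Return every occurrence of `word` that is delimited by \\b on both sides."""
    # re.escape keeps any regex metacharacters in `word` literal.
    pattern = r"\b" + re.escape(word) + r"\b"
    return re.findall(pattern, text)

# Same input as the PHP example: only the standalone word is reported.
print(find_whole_word("home", "homeless and home"))  # ['home']
```

Here the literal characters around each \b already show which side of the word is meant, so the extra lookarounds that PCRE adds for [[:<:]] and [[:>:]] are not needed.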
These word boundaries are mostly supported by POSIX-like engines; see this PostgreSQL regex demo, for example. The [[:<:]]HR[[:>:]] regex finds a match in Head of HR, but finds no match in <A HREF="some.html or in CHROME.
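For an engine that does not recognize [[:<:]] and [[:>:]], one possible workaround is to rewrite them into the equivalents quoted from the PCRE documentation before compiling the pattern. The sketch below does this in Python and checks the three strings from the PostgreSQL example. The function translate_bsd_boundaries is a hypothetical helper, and its plain string replacement assumes the two sequences never appear escaped or inside a bracket expression.

```python
import re

# Equivalents taken from the PCRE documentation quoted earlier.
BSD_BOUNDARIES = {
    "[[:<:]]": r"\b(?=\w)",   # start of word
    "[[:>:]]": r"\b(?<=\w)",  # end of word
}

def translate_bsd_boundaries(pattern):
    """Replace BSD-style word boundaries with lookaround-based equivalents.

    Assumption: a naive textual replacement is good enough for the pattern.
    """
    for bsd, equivalent in BSD_BOUNDARIES.items():
        pattern = pattern.replace(bsd, equivalent)
    return pattern

regex = re.compile(translate_bsd_boundaries("[[:<:]]HR[[:>:]]"))

for text in ["Head of HR", '<A HREF="some.html', "CHROME"]:
    print(repr(text), "->", bool(regex.search(text)))
```

Only Head of HR should report a match, because in the other two strings HR sits next to further word characters.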
Other regex engines that support the [[:<:]] and [[:>:]] word boundaries are base R (e.g. gsub with no perl=TRUE argument) and MySQL (before version 8.0.4, where the regex implementation switched to ICU).
In Tcl regex, there is \m for the starting word boundary ([[:<:]]) and \M for the ending word boundary ([[:>:]]).