[[:>:]] or [[:<:]] do not match

Question

I'm trying to use [[:>:]] in my regex, but it is not accepted, while other character classes such as [[:digit:]] or [[:word:]] are. What's going wrong?

Online demo

Recommended answer

It is a bug, because these constructs (starting word boundary, [[:<:]], and ending [[:>:]] word boundary) are supported by the PCRE library itself:

COMPATIBILITY FEATURE FOR WORD BOUNDARIES

  In  the POSIX.2 compliant library that was included in 4.4BSD Unix, the
  ugly syntax [[:<:]] and [[:>:]] is used for matching  "start  of  word"
  and "end of word". PCRE treats these items as follows:

    [[:<:]]  is converted to  \b(?=\w)
    [[:>:]]  is converted to  \b(?<=\w)

  Only these exact character sequences are recognized. A sequence such as
  [a[:<:]b] provokes error for an unrecognized  POSIX  class  name.  This
  support  is not compatible with Perl. It is provided to help migrations
  from other environments, and is best not used in any new patterns. Note
  that  \b matches at the start and the end of a word (see "Simple asser-
  tions" above), and in a Perl-style pattern the preceding  or  following
  character  normally  shows  which  is  wanted, without the need for the
  assertions that are used above in order to give exactly the  POSIX  be-
  haviour.
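
For an engine that lacks the POSIX-style boundaries, the conversion quoted above can be applied by hand. The following is only a minimal sketch in Python, whose `re` module supports `\b` and lookarounds; the helper name `translate_posix_boundaries` is hypothetical, and it does a naive textual substitution rather than parsing the pattern.

```python
import re

# Replacements taken from the PCRE documentation quoted above.
POSIX_WORD_BOUNDARIES = {
    "[[:<:]]": r"\b(?=\w)",   # start of word
    "[[:>:]]": r"\b(?<=\w)",  # end of word
}


def translate_posix_boundaries(pattern: str) -> str:
    """Hypothetical helper: rewrite [[:<:]] and [[:>:]] as Perl-style assertions.

    This is a plain string substitution (an assumption made for brevity).
    It does not parse the pattern, so it would also rewrite the sequence
    inside a bracket expression such as [a[:<:]b], which PCRE rejects.
    """
    for posix, perl_style in POSIX_WORD_BOUNDARIES.items():
        pattern = pattern.replace(posix, perl_style)
    return pattern


translated = translate_posix_boundaries("[[:<:]]home[[:>:]]")
print(translated)                                   # \b(?=\w)home\b(?<=\w)
print(re.findall(translated, "homeless and home"))  # ['home']
```

Only the standalone word matches, because the end-of-word assertion fails between "home" and "less".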

When used in PHP code, it works:

if (preg_match_all('/[[:<:]]home[[:>:]]/', 'homeless and home', $m))
{
    print_r($m[0]);
}

This finds Array ( [0] => home ). See the online PHP demo.

So, it is the regex101.com developer team that decided not to (or forgot to) include support for these paired word boundaries.

At regex101.com, instead, use \b word boundaries (both as starting and ending ones) that are supported by all 4 regex101.com regex engines: PCRE, JS, Python and Go.

These word boundaries are mostly supported by POSIX-like engines; see the PostgreSQL regex demo, for example. The [[:<:]]HR[[:>:]] regex finds a match in Head of HR, but finds no match in <A HREF="some.html or in CHROME.
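
The same three inputs can be checked in an engine without [[:<:]] and [[:>:]] by using plain `\b` on both sides. This is a small Python sketch, not the PostgreSQL demo itself; the variable names are illustrative.

```python
import re

# \b on both sides stands in for [[:<:]] ... [[:>:]] here, because HR
# begins and ends with word characters.
pattern = re.compile(r"\bHR\b")

samples = ["Head of HR", '<A HREF="some.html', "CHROME"]
results = {text: bool(pattern.search(text)) for text in samples}

for text, matched in results.items():
    print(f"{text!r}: {'match' if matched else 'no match'}")
```

Only Head of HR matches: in HREF the letters are followed by another word character, and in CHROME they are also preceded by one.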

Other regex engines that support the [[:<:]] and [[:>:]] word boundaries are base R (e.g., gsub without the perl=TRUE argument) and MySQL before version 8.0.4 (later versions use the ICU regex library, which supports \b instead).

In Tcl regex, there is \m for [[:<:]] (starting word boundary) and \M for ending word boundary ([[:>:]]).


08-22 16:12