Anchors in .NET regular expressions

Question

This relates to a question about checking a string to see whether all of its characters are hexadecimal values.

The proposed regular expression is the following:

\A\b[0-9a-fA-F]+\b\Z

Now, \A and \Z seem to be equivalent to ^ and $ respectively. \Z behaves differently in that it allows a newline after it when matching (this might or might not be intended).

What I don't understand is why the \b ("match at word boundary") anchor is used. Isn't the beginning or end of a string always a word boundary?

Ultimately, the regex could be rewritten as ^[0-9a-fA-F]+$ with the same behavior (ignoring the trailing \n issue). Am I missing something? Is \b required for some weird edge case?

Test cases:

123ABC -> Returns true
123def -> Returns true
123g -> Returns false

Recommended answer

The word boundary \b matches between a non-word character and a word character. It also matches at the start of the string if the first character is a word character, and at the end of the string if the last character is a word character.
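To make the rule concrete, here is a minimal sketch that lists every index at which \b matches. It is written in Java rather than C#, which is an assumption made for convenience: java.util.regex handles \b the same way as .NET for the plain ASCII inputs used here. The class and method names are hypothetical.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

// Hypothetical helper for illustration; not part of any library.
final class WordBoundaryDemo {
    private static final Pattern BOUNDARY = Pattern.compile("\\b");

    // Returns the indices at which \b matches in the input.
    static List<Integer> boundaries(String input) {
        List<Integer> positions = new ArrayList<>();
        Matcher m = BOUNDARY.matcher(input);
        while (m.find()) {
            positions.add(m.start());
        }
        return positions;
    }

    public static void main(String[] args) {
        // Starts and ends with word characters: both ends are boundaries.
        System.out.println(boundaries("ab-cd")); // [0, 2, 3, 5]
        // Starts and ends with '-': neither end is a boundary.
        System.out.println(boundaries("-ab-"));  // [1, 3]
    }
}
```

The second input shows that the start and end of a string are word boundaries only when the adjacent character is a word character.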

Thus, \A\b[0-9a-fA-F]+\b\Z is equivalent to \A[0-9a-fA-F]+\Z. For the pattern to match, every character it consumes must be a word character (a [0-9] digit or an [a-fA-F] letter), so the first and last matched characters always sit on a word boundary and both \b assertions succeed automatically.
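The equivalence can be spot-checked with a short sketch. It uses Java as a stand-in for .NET, on the assumption that \A, \b, and \Z behave the same in both engines for these ASCII inputs. The class and method names are hypothetical, and a handful of samples illustrates the claim rather than proving it.

```java
import java.util.regex.Pattern;

// Hypothetical helper for illustration; not part of any library.
final class HexAnchorsDemo {
    private static final Pattern WITH_B =
            Pattern.compile("\\A\\b[0-9a-fA-F]+\\b\\Z");
    private static final Pattern WITHOUT_B =
            Pattern.compile("\\A[0-9a-fA-F]+\\Z");

    // find() is enough here because the pattern carries its own anchors.
    static boolean withBoundaries(String s) {
        return WITH_B.matcher(s).find();
    }

    static boolean withoutBoundaries(String s) {
        return WITHOUT_B.matcher(s).find();
    }

    public static void main(String[] args) {
        String[] samples = {"123ABC", "123def", "123g", "", "12 34"};
        for (String s : samples) {
            // Both columns agree for every sample.
            System.out.println("\"" + s + "\" -> "
                    + withBoundaries(s) + " / " + withoutBoundaries(s));
        }
    }
}
```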

It would be a different story for \A\b[0-9a-fA-F-]+\b\Z. Because the character class now includes the non-word character -, the \b assertions do real work: the pattern only matches strings that begin and end with a word character.
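A small sketch of that difference follows, again in Java as a stand-in for .NET (an assumption; the anchors involved behave the same for these inputs). The class and method names are hypothetical.

```java
import java.util.regex.Pattern;

// Hypothetical helper for illustration; not part of any library.
final class HyphenBoundaryDemo {
    private static final Pattern WITH_B =
            Pattern.compile("\\A\\b[0-9a-fA-F-]+\\b\\Z");
    private static final Pattern WITHOUT_B =
            Pattern.compile("\\A[0-9a-fA-F-]+\\Z");

    static boolean withBoundaries(String s) {
        return WITH_B.matcher(s).find();
    }

    static boolean withoutBoundaries(String s) {
        return WITHOUT_B.matcher(s).find();
    }

    public static void main(String[] args) {
        // A hyphen in the middle is accepted by both patterns.
        System.out.println(withBoundaries("12-34"));    // true
        System.out.println(withoutBoundaries("12-34")); // true
        // A leading or trailing hyphen is rejected only when \b is present.
        System.out.println(withBoundaries("-1234"));    // false
        System.out.println(withoutBoundaries("-1234")); // true
        System.out.println(withBoundaries("1234-"));    // false
        System.out.println(withoutBoundaries("1234-")); // true
    }
}
```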

Use \z instead of \Z to match the whole string with no \n allowed at the end.
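The following sketch contrasts the two end anchors on an input with a trailing newline. It uses Java as a stand-in for .NET, on the assumption that \Z and \z behave the same in both engines when the terminator is a single \n. The class and method names are hypothetical.

```java
import java.util.regex.Pattern;

// Hypothetical helper for illustration; not part of any library.
final class EndAnchorDemo {
    // \Z: end of input, or just before a final line terminator.
    private static final Pattern LENIENT =
            Pattern.compile("\\A[0-9a-fA-F]+\\Z");
    // \z: the very end of the input, nothing may follow.
    private static final Pattern STRICT =
            Pattern.compile("\\A[0-9a-fA-F]+\\z");

    // find() is used because \Z does not consume the trailing newline,
    // so a full-region matches() call would reject it anyway.
    static boolean lenient(String s) {
        return LENIENT.matcher(s).find();
    }

    static boolean strict(String s) {
        return STRICT.matcher(s).find();
    }

    public static void main(String[] args) {
        System.out.println(lenient("123ABC"));   // true
        System.out.println(strict("123ABC"));    // true
        System.out.println(lenient("123ABC\n")); // true
        System.out.println(strict("123ABC\n"));  // false
    }
}
```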

08-14 22:14