Problem description
(Note: The title doesn't seem too clear -- if someone can rephrase it, I'm all for it!)
Given this regex: (.*_e\.txt), which matches some filenames, I need to add some other single-character suffixes in addition to the e. Should I choose a character class or should I use an alternation for this? (Or does it really matter?)
That is, which of the following two seems "better", and why:
a) (.*(e|f|x)\.txt)
or
b) (.*[efx]\.txt)
Recommended answer
Use [efx] - that's exactly what character classes are designed for: to match one of the included characters. Therefore it's also the most readable and shortest solution.
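As a quick check that the two forms accept the same names, here is a minimal Python sketch. The patterns are copied from the question as written; the filenames are made up for illustration, and `accepted` is a hypothetical helper, not part of any library.

```python
import re

# The two candidate patterns from the question, copied as written.
ALTERNATION = re.compile(r"(.*(e|f|x)\.txt)")
CHAR_CLASS = re.compile(r"(.*[efx]\.txt)")

# Hypothetical filenames, made up for this illustration.
names = ["report_e.txt", "report_f.txt", "report_x.txt",
         "report_q.txt", "report_e.csv"]

def accepted(pattern, candidates):
    """Return the candidates that the pattern matches in full."""
    return [name for name in candidates if pattern.fullmatch(name)]

print(accepted(ALTERNATION, names))
print(accepted(CHAR_CLASS, names))

# The alternation form also introduces a second capturing group.
print(ALTERNATION.groups, CHAR_CLASS.groups)
```

Both patterns accept the same three names here. One practical difference is that the parentheses around e|f|x create an extra capturing group, which the character class does not.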
I don't know if it's faster, but I would be very much surprised if it wasn't. It definitely won't be slower.
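If you would rather measure than guess, a rough timing sketch in Python could look like the following. The input string and repeat count are arbitrary assumptions, and `time_pattern` is a hypothetical helper. Results depend on the regex engine, its version, and the input, so treat any numbers as indicative only.

```python
import re
import timeit

ALTERNATION = re.compile(r"(.*(e|f|x)\.txt)")
CHAR_CLASS = re.compile(r"(.*[efx]\.txt)")

# Hypothetical input: a long filename ending in one of the suffixes.
name = "a" * 200 + "_x.txt"

def time_pattern(pattern, text, repeats=2000):
    """Return the total seconds taken to run fullmatch `repeats` times."""
    return timeit.timeit(lambda: pattern.fullmatch(text), number=repeats)

alt_seconds = time_pattern(ALTERNATION, name)
cls_seconds = time_pattern(CHAR_CLASS, name)
print(f"alternation: {alt_seconds:.4f}s, character class: {cls_seconds:.4f}s")
```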
My reasoning (without ever having written a regex engine, so this is pure conjecture):
The regex token [abc] will be applied in a single step of the regex engine: "Is the next character one of a, b, or c?"
(a|b|c), however, tells the regex engine to
- remember the current position in the string for backtracking, if necessary
- check if it's possible to match a. If so, success. If not:
  - check if it's possible to match b. If so, success. If not:
    - check if it's possible to match c. If so, success. If not:
      - give up.
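The steps above can be sketched as a toy model in Python. This only illustrates the conjecture and does not show how any real regex engine is implemented; the function names are hypothetical, and a real engine may well optimize a single-character alternation into something equivalent to a character class.

```python
def match_char_class(text, pos, allowed):
    """One step: is the character at `pos` in the allowed set?

    Returns the new position, or None on failure.
    """
    if pos < len(text) and text[pos] in allowed:
        return pos + 1
    return None

def match_alternation(text, pos, alternatives):
    """Try each alternative in order, always starting from the saved position."""
    saved = pos  # remembered so every branch restarts from the same place
    for alt in alternatives:
        if text.startswith(alt, saved):
            return saved + len(alt)  # this branch matched: success
    return None  # every branch failed: give up

print(match_char_class("cat", 0, set("abc")))
print(match_alternation("cat", 0, ["a", "b", "c"]))
```

Both calls consume the same single character; the difference is that the alternation model makes up to three separate attempts where the character class makes one membership test.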