Problem description
I'm currently learning Lua. Regarding pattern matching in Lua, I found the following sentence in the Lua documentation on lua.org:
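"Nevertheless, pattern matching in Lua is a powerful tool and includes some features that are difficult to match with standard POSIX implementations."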
As I'm familiar with POSIX regular expressions, I would like to know whether there are any common examples where Lua pattern matching is "better" than regular expressions, or did I misinterpret the sentence? If there are such examples, why is pattern matching better suited than regular expressions in those cases?
Recommended answer
It is not so much particular examples as that Lua patterns have a higher signal-to-noise ratio than POSIX regular expressions. It is the overall design that is often preferable, not particular examples.
Here are some factors that contribute to the good design:
Very lightweight syntax for matching common character types, including uppercase letters (%u), decimal digits (%d), space characters (%s), and so on. Any character type can be complemented by using the corresponding capital letter, so the pattern %S matches any nonspace character.
Quoting is extremely simple and regular. The quoting character is %, so it is always distinct from the string-quoting character \, which makes Lua patterns much easier to read than POSIX regular expressions (when quoting is necessary). It is always safe to quote symbols, and it is never necessary to quote letters, so you can just go by that rule of thumb instead of memorizing which symbols are special metacharacters.
Lua offers "captures" and can return multiple captures as the result of a match call. This interface is much, much better than capturing substrings through side effects or having some hidden state that has to be interrogated to find captures. Capture syntax is simple: just use parentheses.
Lua has a "shortest match" modifier, -, to go along with the "longest match" operator, *. So, for example, s:find '%s(%S-)%.' finds the shortest sequence of nonspace characters that is preceded by a space and followed by a dot.
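The factors above can be seen together in a short sketch. The sample strings and variable names are made up for illustration; only the standard string.match method is used.

```lua
-- Character classes and their complements.
local s = "Order 66 shipped"
local upper  = s:match("%u")    -- first uppercase letter: "O"
local digits = s:match("%d+")   -- first run of decimal digits: "66"
local word   = s:match("%S+")   -- first run of nonspace characters: "Order"

-- Quoting with %: "%." is a literal dot, and no backslashes are needed
-- inside the Lua string literal.
local version = "lua-5.4.6"
local major, minor = version:match("(%d+)%.(%d+)")  -- "5", "4"

-- Multiple captures come back directly as multiple return values.
local key, value = ("name = Lua"):match("(%w+)%s*=%s*(%w+)")  -- "name", "Lua"

-- Shortest match (-) versus longest match (*).
local text = "see foo.bar.baz"
local shortest = text:match("%s(%S-)%.")  -- "foo"
local longest  = text:match("%s(%S*)%.")  -- "foo.bar"
```

The last two lines differ only in the repetition operator: - stops at the first dot it can reach, while * consumes as much as possible and then backs up to the last dot.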
The expressive power of Lua patterns is comparable to POSIX "basic" regular expressions, without the alternation operator |. What you are giving up is "extended" regular expressions with |. If you need that much expressive power, I recommend going all the way to LPEG, which gives you essentially the power of context-free grammars at quite reasonable cost.
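For simple cases, the missing | can sometimes be worked around by trying several patterns in turn. The sketch below is only an illustration: match_any is a hypothetical helper, not part of Lua or LPEG, and it is not equivalent to true alternation inside a larger pattern.

```lua
-- Hypothetical helper: returns the first result of the first pattern
-- in `patterns` that matches `s`, or nil if none of them match.
local function match_any(s, patterns)
  for _, p in ipairs(patterns) do
    local m = s:match(p)
    if m then
      return m
    end
  end
  return nil
end

-- Roughly what an extended regex would write as \.(jpe?g|png)$
local image_exts = { "%.jpe?g$", "%.png$" }

local a = match_any("photo.jpeg", image_exts)  -- ".jpeg"
local b = match_any("icon.png", image_exts)    -- ".png"
local c = match_any("notes.txt", image_exts)   -- nil
```

Once the alternatives need to nest or appear in the middle of a longer pattern, this approach gets awkward quickly, which is where a grammar-based tool such as LPEG becomes the better fit.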