Question
After some research I figured that it is not possible to parse recursive structures (such as HTML or XML) using regular expressions. Is it possible to comprehensively list out day to day coding scenarios where I should avoid using regular expressions because it is just impossible to do that particular task using regular expressions? Let us say the regex engine in question is not PCRE.
Recommended answer
Don't use regular expressions when:
- the language you are trying to parse is not a regular language, or
- a ready-made parser already exists for the data you are trying to parse.
Parsing HTML and XML with regular expressions is usually a bad idea, both because they are not regular languages and because libraries already exist that can parse them for you.
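To illustrate the "use an existing parser" point, here is a minimal sketch in Python that pulls link targets out of nested HTML with the standard library's `html.parser` module. The `LinkCollector` class, the `extract_links` helper, and the sample markup are made up for this illustration; they are not part of the original answer.

```python
from html.parser import HTMLParser


class LinkCollector(HTMLParser):
    """Collects href values of <a> tags, however deeply they are nested."""

    def __init__(self):
        super().__init__()
        self.links = []

    def handle_starttag(self, tag, attrs):
        # attrs is a list of (name, value) pairs; value is None for bare attributes
        if tag == "a":
            for name, value in attrs:
                if name == "href" and value is not None:
                    self.links.append(value)


def extract_links(html):
    """Hypothetical helper: return all href values found in the markup."""
    parser = LinkCollector()
    parser.feed(html)
    parser.close()
    return parser.links


sample = (
    '<div><!-- <a href="/old">old</a> -->'
    '<p>See <a href="/docs">docs</a> and '
    '<span><a href="/faq">FAQ</a></span></p></div>'
)
print(extract_links(sample))  # ['/docs', '/faq']
```

The commented-out link is skipped because the parser knows it sits inside a comment, which is exactly the kind of context a simple pattern such as `href="([^"]*)"` would miss.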
As another example, if you need to check if an integer is in the range 0-255, it's easier to understand if you use your language's library functions to parse it to an integer and then check its numeric value instead of trying to write the regular expression that matches this range.
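The 0-255 case can be sketched side by side in Python. Both function names below are invented for this comparison, and the pattern shown is just one of several ways to write the range as a regular expression.

```python
import re

# One possible regex for 0-255 with no leading zeros: each alternative
# covers a slice of the range (250-255, 200-249, 100-199, 0-99).
OCTET_RE = re.compile(r"25[0-5]|2[0-4][0-9]|1[0-9][0-9]|[1-9]?[0-9]")


def in_range_regex(text):
    """Regex version: correct, but the range is hidden in the pattern."""
    return OCTET_RE.fullmatch(text) is not None


def in_range_numeric(text):
    """Parse first, then compare: the range is stated directly."""
    try:
        value = int(text)
    except ValueError:
        return False
    return 0 <= value <= 255


for candidate in ["0", "42", "255", "256", "-1", "abc"]:
    print(candidate, in_range_regex(candidate), in_range_numeric(candidate))
```

The two versions are not strictly equivalent on edge cases: Python's `int()` also accepts surrounding whitespace, a leading `+`, and leading zeros (so `" 007 "` passes the numeric check but not the regex). If the input format must be strict, that needs an extra check either way.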