Question
I was hoping to write my own syntax highlighter for a summer project I am thinking of working on, but I am not sure how to go about it.
I know that there are a bunch of implementations out there, but I would like to learn about regular expressions and how syntax highlighting works.
How does syntax highlighting work, and what are some good references for developing one? Does the syntax highlighter scan each character as it is typed, or does it rescan the whole document/text area after each character is typed?
Any insight would be much appreciated.
Thanks.
PS: I was planning on writing it in ActionScript.
Recommended answer
Syntax highlighters can work in two very general ways. The first implements a full lexer and parser for the language(s) being highlighted, exactly identifying each token's type (keyword, class name, instance name, variable type, preprocessor directive...). This provides all the information needed to highlight the code exactly according to some specification (keywords in red, class names in blue, what have you).
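To give a feel for the first approach, here is a minimal sketch of just the lexing half, written in Python for brevity (the same idea carries over to ActionScript's RegExp class). The token names and the tiny keyword list are made up for illustration and do not describe any real language. A highlighter of this kind would also need a parser on top of this to tell, say, class names from variable names.

```python
import re

# Hypothetical token rules for a tiny made-up language. Order matters:
# the first alternative that matches at a position wins.
TOKEN_SPEC = [
    ("comment", r"//[^\n]*"),
    ("string",  r'"(?:\\.|[^"\\\n])*"'),
    ("number",  r"\d+(?:\.\d+)?"),
    ("keyword", r"\b(?:var|function|return|if|else)\b"),
    ("ident",   r"[A-Za-z_]\w*"),
    ("space",   r"\s+"),
    ("other",   r"."),
]
MASTER = re.compile("|".join(f"(?P<{name}>{pattern})" for name, pattern in TOKEN_SPEC))

def tokenize(source):
    """Return a list of (token_type, text) pairs covering the whole input."""
    return [(m.lastgroup, m.group()) for m in MASTER.finditer(source)]

tokens = tokenize('var x = 42; // answer')
```

Each pair can then be mapped to a colour. Because every character ends up in some token, joining the token texts gives back the original source, which is a handy sanity check while developing the rules.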
The second way is something like the one Google Code Prettify employs: instead of implementing one lexer/parser per language, it uses a couple of very general parsers that can do a decent job on most syntaxes. This highlighter, for example, will be able to parse and highlight reasonably well any C-like language, because its lexer/parser can identify the general components of those kinds of languages.
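As a rough illustration of the generic idea (this is not Prettify's actual code, just a sketch under my own assumptions), a single set of loose rules for comments, strings, numbers and a shared keyword list already covers a lot of C-like code. The class names `com`, `str`, `num` and `kwd` and the keyword set below are hypothetical choices.

```python
import html
import re

# Assumed, deliberately loose keyword list shared by many C-like languages.
GENERIC_KEYWORDS = {"if", "else", "for", "while", "return", "class",
                    "function", "var", "int", "void", "new", "public", "static"}

GENERIC = re.compile(r"""
    (?P<com>//[^\n]*|/\*.*?\*/)                      # line or block comment
  | (?P<str>"(?:\\.|[^"\\])*"|'(?:\\.|[^'\\])*')     # quoted string
  | (?P<num>\b\d+(?:\.\d+)?\b)                       # number literal
  | (?P<word>[A-Za-z_]\w*)                           # keyword or identifier
""", re.VERBOSE | re.DOTALL)

def highlight(source):
    """Wrap recognised tokens in <span class="..."> and HTML-escape everything."""
    out, pos = [], 0
    for m in GENERIC.finditer(source):
        out.append(html.escape(source[pos:m.start()]))  # text between tokens
        kind, text = m.lastgroup, m.group()
        if kind == "word":
            # Only words from the shared list get coloured; other identifiers stay plain.
            kind = "kwd" if text in GENERIC_KEYWORDS else None
        escaped = html.escape(text)
        out.append(f'<span class="{kind}">{escaped}</span>' if kind else escaped)
        pos = m.end()
    out.append(html.escape(source[pos:]))
    return "".join(out)

print(highlight('if (x < 10) return "a"; // done'))
```

The trade-off is visible right away: nothing here knows whether `x` is a class, a variable or a type, so only the broad categories get coloured.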
This also has the advantage that you don't need to explicitly specify the language, as the engine will determine by itself which of its generic parsers can do the best job. The downside, of course, is that the highlighting is less accurate than with a language-specific parser.
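How an engine picks among its generic parsers is an implementation detail I can't speak to for any particular library. One simple heuristic, purely as an assumption on my part, is to count how often constructs typical of each language family show up and take the best score. The family names and hint patterns below are hypothetical.

```python
import re

# Hypothetical language "families"; each regex lists constructs typical of that family.
FAMILY_HINTS = {
    "c-like": re.compile(r"//|/\*|[{};]|\b(?:int|void|var|function)\b"),
    "lisp-like": re.compile(r"\(\s*(?:define|defun|lambda|let)\b|;;"),
    "markup": re.compile(r"</?[A-Za-z][^>]*>|<!--"),
}

def guess_family(source):
    """Pick the family whose hints match most often (ties go to the first listed)."""
    scores = {name: len(rx.findall(source)) for name, rx in FAMILY_HINTS.items()}
    return max(scores, key=scores.get)

print(guess_family('int main() { return 0; }'))
print(guess_family('<p>Hello <b>world</b></p>'))
```

A guess like this can be wrong on short or mixed snippets, which is part of why the generic approach is less accurate than a language-specific one.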