Problem description
I have seen many ANTLR grammars that handle whitespace like this:
WS: [ \n\t\r]+ -> skip;
// or
WS: [ \n\t\r]+ -> channel(HIDDEN);
So whitespace is either thrown away or sent to the hidden channel, respectively.
With a grammar like this:
grammar Not;
start: expression;
expression: NOT expression
| (TRUE | FALSE);
NOT: 'not';
TRUE: 'true';
FALSE: 'false';
WS: [ \n\t\r]+ -> skip;
the valid inputs are 'not true' and 'not false', but 'nottrue' is also accepted, which is not the desired result. Changing the grammar to:
grammar Not;
start: expression;
expression: NOT WS+ expression
| (TRUE | FALSE);
NOT: 'not';
TRUE: 'true';
FALSE: 'false';
WS: [ \n\t\r];
fixes the problem, but I do not want to handle whitespace manually in every rule.
Generally I want whitespace to be required between tokens, with some exceptions (e.g. '!true' does not need whitespace in between).
Is there a simple way of doing this?
Recommended answer
Add an IDENTIFIER lexer rule to handle words which are not keywords.
IDENTIFIER : [a-zA-Z]+;
Now the text nottrue is a single IDENTIFIER token, because the lexer always prefers the longest match, and your parser will not accept it in place of the distinct keywords in not true.
Make sure IDENTIFIER is defined after your other keywords. Both NOT and IDENTIFIER match the text not with the same length, and in that case the lexer assigns the token type of the rule that appears first in the grammar.
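Putting the pieces together, the complete grammar might look like the sketch below. It keeps the original skip-based WS rule and places IDENTIFIER after the keywords. Two things here are assumptions, not part of the original answer: the BANG token, added only to illustrate the '!true' case from the question, and the EOF at the end of start, which makes the parser consume the whole input.

```antlr
grammar Not;

// EOF is an assumption: it forces the parser to consume the entire input.
start: expression EOF;

expression
    : (NOT | BANG) expression
    | TRUE
    | FALSE
    ;

// Keywords come first so they win the tie against IDENTIFIER
// when both rules match text of the same length.
NOT: 'not';
BANG: '!';      // hypothetical token for the '!true' case
TRUE: 'true';
FALSE: 'false';

// Catch-all for other words. 'nottrue' is matched here as one token
// because the lexer prefers the longest match.
IDENTIFIER: [a-zA-Z]+;

WS: [ \t\r\n]+ -> skip;
```

With this sketch, 'not true' and '!true' should parse, while 'nottrue' lexes as a single IDENTIFIER that no parser rule accepts, so it is reported as a syntax error. No parser rule has to mention WS.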