ANTLR4: Whitespace handling


Question

I have seen many ANTLR grammars that handle whitespace like this:

WS: [ \n\t\r]+ -> skip;
// or
WS: [ \n\t\r]+ -> channel(HIDDEN);

So whitespace is either discarded or sent to the hidden channel, respectively.

With a grammar like this:

grammar Not;

start:      expression;
expression: NOT expression
          | (TRUE | FALSE);

NOT:    'not';
TRUE:   'true';
FALSE:  'false';
WS: [ \n\t\r]+ -> skip;

valid inputs are 'not true' and 'not false', but 'nottrue' is also accepted, which is not the desired result. Changing the grammar to:

grammar Not;

start:      expression;

expression: NOT WS+ expression
          | (TRUE | FALSE);

NOT:    'not';

TRUE:   'true';
FALSE:  'false';

WS: [ \n\t\r];

fixes the problem, but I do not want to handle whitespace manually in every rule.

Generally I want whitespace to be required between tokens, with some exceptions (e.g. '!true' does not need whitespace between '!' and 'true').

Is there a simple way of doing this?

Recommended answer

Add an IDENTIFIER lexer rule to handle words that are not keywords.

IDENTIFIER : [a-zA-Z]+;

Now the text nottrue is lexed as a single IDENTIFIER token, because the lexer always prefers the rule that matches the longest run of input. Your parser will not accept that token in place of the two distinct keywords in not true.

Make sure IDENTIFIER is defined after your keyword rules. For the text not, both NOT and IDENTIFIER match the same number of characters, and the lexer breaks such ties by assigning the token type of the rule that appears first in the grammar.
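
Putting this together, the grammar from the question might look like the sketch below. It is not a definitive version: the BANG token and the '!' alternative are assumptions added only to illustrate the '!true' case from the question, and the EOF at the end of start is also an addition, there so the parser has to consume the whole input.

```antlr
grammar Not;

// EOF (an addition to the original rule) forces the whole input to be parsed.
start:      expression EOF;

expression: NOT expression
          | BANG expression      // hypothetical alternative for '!true'
          | (TRUE | FALSE);

// Keywords are listed first so they win equal-length ties against IDENTIFIER.
NOT:        'not';
TRUE:       'true';
FALSE:      'false';
BANG:       '!';                 // hypothetical token, not in the original grammar

// Catch-all for any other word; must come after the keyword rules.
IDENTIFIER: [a-zA-Z]+;

WS:         [ \n\t\r]+ -> skip;
```

With this grammar, not true is lexed as NOT TRUE and parses. nottrue is lexed as one IDENTIFIER, which no parser rule references, so the parser reports an error. !true is lexed as BANG TRUE without any whitespace, because '!' cannot be part of an IDENTIFIER. Whitespace can still be skipped globally, so no parser rule has to mention WS.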
