Question
If I have the following grammar to parse a list of integers separated by whitespace:
grammar TEST;
test
: expression* EOF
;
expression
: integerLiteral
;
integerLiteral
: INTLITERAL
;
PLUS: '+';
MINUS: '-';
DIGIT: '0'..'9';
DIGITS: DIGIT+;
INTLITERAL: (PLUS|MINUS)? DIGITS;
WS: [ \t\r\n] -> skip;
It does not work! If I pass "100", I get:
line 1:0 extraneous input '100' expecting {<EOF>, INTLITERAL}
However, if I remove the lexer rule INTLITERAL and inline its body into the parser rule integerLiteral, like this:
integerLiteral
: (PLUS|MINUS)? DIGITS
;
Now it seems to work just fine!
I feel that if I can understand why this happens, I'll begin to understand some of the idiosyncrasies I am running into.
Recommended answer
The lexer creates tokens in the following manner:
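- it tries to match as many characters as possible for a single token
- if two token rules match the same characters, the one defined first "wins"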
Given the 2 rules above, you will see that your rules:
DIGITS: DIGIT+;
INTLITERAL: (PLUS|MINUS)? DIGITS;
are the problem. For the input 100, rule 2 applies: both DIGITS and INTLITERAL match 100, but since DIGITS is defined before INTLITERAL, a DIGITS token is created. The parser rule integerLiteral expects an INTLITERAL token, which is why the DIGITS token is reported as extraneous input. The version with (PLUS|MINUS)? DIGITS written directly in the parser rule works because it refers to DIGITS itself.
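The two lexer rules and their effect on the input 100 can be modelled with a small Python sketch. This is only a toy model built on regular expressions, not how ANTLR is implemented; the function name tokenize and the regex stand-ins for each rule are assumptions made for illustration.

```python
import re

def tokenize(text, rules):
    """Toy model of the two lexer rules; not ANTLR's actual implementation.

    rules is a list of (token_name, regex) pairs in definition order.
    """
    tokens = []
    pos = 0
    while pos < len(text):
        best_name, best_len = None, 0
        for name, pattern in rules:
            m = re.compile(pattern).match(text, pos)
            # A strictly longer match replaces the current best (rule 1).
            # An equally long match never does, so the earlier rule wins (rule 2).
            if m and len(m.group()) > best_len:
                best_name, best_len = name, len(m.group())
        if best_name is None:
            raise ValueError(f"no rule matches at position {pos}")
        if best_name != "WS":  # mimic '-> skip'
            tokens.append((best_name, text[pos:pos + best_len]))
        pos += best_len
    return tokens

# Regex stand-ins for the lexer rules of the question, in the same order.
ORIGINAL_ORDER = [
    ("PLUS", r"\+"),
    ("MINUS", r"-"),
    ("DIGIT", r"[0-9]"),
    ("DIGITS", r"[0-9]+"),
    ("INTLITERAL", r"[+-]?[0-9]+"),
    ("WS", r"[ \t\r\n]"),
]

print(tokenize("100", ORIGINAL_ORDER))     # [('DIGITS', '100')]
print(tokenize("-100 7", ORIGINAL_ORDER))  # [('INTLITERAL', '-100'), ('DIGIT', '7')]
```

In this model INTLITERAL only wins when a sign is present, because only then is its match longer than anything DIGITS can offer. A single digit even becomes a DIGIT token, since DIGIT is defined before both of the other rules.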
Solution 1
Move INTLITERAL above DIGITS:
INTLITERAL: (PLUS|MINUS)? DIGITS;
DIGIT: '0'..'9';
DIGITS: DIGIT+;
But now notice that DIGIT and DIGITS will never become tokens on their own, because INTLITERAL will always be matched first. In this case you can make both of these rules fragments, and then it doesn't matter where you place them, because fragment rules are only used inside other lexer rules (not in parser rules).
Solution 2
Make DIGIT and DIGITS fragments:
fragment DIGIT: '0'..'9';
fragment DIGITS: DIGIT+;
INTLITERAL: (PLUS|MINUS)? DIGITS;
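One way to picture fragments is as reusable pieces of a pattern that never appear in the list of token types. The Python sketch below is an analogy under that assumption, not ANTLR's implementation; the helper first_token and the regex strings are hypothetical stand-ins.

```python
import re

# Fragments modelled as plain strings: building blocks, never tokens themselves.
DIGIT = r"[0-9]"
DIGITS = rf"(?:{DIGIT})+"

# Only these names can be emitted as tokens, listed in definition order.
TOKEN_RULES = [
    ("PLUS", r"\+"),
    ("MINUS", r"-"),
    ("INTLITERAL", rf"[+-]?{DIGITS}"),
]

def first_token(text):
    """Return (name, lexeme) for the longest match at the start of text."""
    candidates = []
    for name, pattern in TOKEN_RULES:
        m = re.match(pattern, text)
        if m:
            candidates.append((name, m.group()))
    if not candidates:
        raise ValueError("no rule matches")
    # max() returns the first of several equally long candidates,
    # so an earlier rule wins a tie.
    return max(candidates, key=lambda c: len(c[1]))

print(first_token("100"))  # ('INTLITERAL', '100')
print(first_token("-42"))  # ('INTLITERAL', '-42')
```

Because DIGIT and DIGITS are no longer candidates, nothing competes with INTLITERAL for the input 100, wherever the fragment definitions are placed.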
Solution 3
Or better, don't glue the operator onto INTLITERAL, but match it in a unary expression:
expression
: (MINUS | PLUS) expression
| expression (MINUS | PLUS) expression
| integerLiteral
;
integerLiteral
: INTLITERAL
;
PLUS: '+';
MINUS: '-';
fragment DIGIT: '0'..'9';
INTLITERAL: DIGIT+;
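A possible motivation for keeping the sign out of the lexer, which the answer does not spell out: with a longest-match lexer, a glued sign swallows the operator in input such as 1-2. The sketch below is a toy regex model of that behaviour, not ANTLR itself; lex, SIGN_IN_LEXER and SIGN_IN_PARSER are names invented for this illustration, and a WS rule is assumed.

```python
import re

def lex(text, rules):
    """Toy lexer: longest match wins, ties go to the rule listed first."""
    pos, tokens = 0, []
    while pos < len(text):
        candidates = []
        for name, pattern in rules:
            m = re.compile(pattern).match(text, pos)
            if m:
                candidates.append((name, m.group()))
        if not candidates:
            raise ValueError(f"no rule matches at position {pos}")
        # max() keeps the first of several equally long candidates.
        name, lexeme = max(candidates, key=lambda c: len(c[1]))
        if name != "WS":  # mimic '-> skip'
            tokens.append((name, lexeme))
        pos += len(lexeme)
    return tokens

COMMON = [("PLUS", r"\+"), ("MINUS", r"-"), ("WS", r"[ \t\r\n]")]
SIGN_IN_LEXER = COMMON + [("INTLITERAL", r"[+-]?[0-9]+")]   # sign glued on
SIGN_IN_PARSER = COMMON + [("INTLITERAL", r"[0-9]+")]       # as in Solution 3

print(lex("1-2", SIGN_IN_LEXER))
# [('INTLITERAL', '1'), ('INTLITERAL', '-2')]
print(lex("1-2", SIGN_IN_PARSER))
# [('INTLITERAL', '1'), ('MINUS', '-'), ('INTLITERAL', '2')]
```

With the sign glued on, the parser never sees a MINUS token between the two literals, so a binary alternative like expression (MINUS | PLUS) expression could not match 1-2. With the sign handled in the parser, the same rule covers both -100 and 1-2.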