Problem description
I just started using ANTLR and am stuck. I have the grammar below and am trying to resolve the ambiguity so that input like Field:ValueString parses correctly.
expression : Field ':' ValueString;
Field : Letter LetterOrDigit*;
ValueString : ~[:];
Letter : [a-zA-Z];
LetterOrDigit : [a-zA-Z0-9];
WS: [ \t\r\n\u000C]+ -> skip;
Suppose a:b is passed to the grammar: both a and b are identified as Field. How do I resolve this in ANTLR4 (C#)?
Recommended answer
You can use a semantic predicate in your lexer rules to perform lookahead (or lookbehind) without consuming characters (see "ANTLR4 negative lookahead in lexer").
In your case, to remove the ambiguity, you can either check that the character after the Field rule is : or check that the character before the ValueString rule is :.
In the first case:
expression : Field ':' ValueString;
Field : Letter LetterOrDigit* {_input.LA(1) == ':'}?;
ValueString : ~[:];
Letter : [a-zA-Z];
LetterOrDigit : [a-zA-Z0-9];
WS: [ \t\r\n\u000C]+ -> skip;
In the second case (note that the order of Field and ValueString has been reversed):
expression : Field ':' ValueString;
ValueString : {_input.LA(-1) == ':'}? ~[:];
Field : Letter LetterOrDigit*;
Letter : [a-zA-Z];
LetterOrDigit : [a-zA-Z0-9];
WS: [ \t\r\n\u000C]+ -> skip;
Also consider using fragment for Letter and LetterOrDigit:
fragment Letter : [a-zA-Z];
fragment LetterOrDigit : [a-zA-Z0-9];
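Putting the pieces together, the first variant combined with the fragment rules might look like the sketch below. The grammar name FieldValue is hypothetical, and the predicate is copied unchanged from the answer above; the exact way to reach the input stream inside a lexer action (_input and LA here) may differ between versions of the C# runtime, so treat that part as an assumption to check against your generated lexer.

```antlr
// Hypothetical grammar name; the rules are the ones from the answer above.
grammar FieldValue;

expression : Field ':' ValueString ;

// Matches only when the next unconsumed character is ':',
// so the "b" in "a:b" can no longer be tokenized as Field.
Field : Letter LetterOrDigit* {_input.LA(1) == ':'}? ;

// Left as in the question: still matches a single non-colon character.
ValueString : ~[:] ;

// Fragments help build Field but never produce tokens of their own.
fragment Letter : [a-zA-Z] ;
fragment LetterOrDigit : [a-zA-Z0-9] ;

WS : [ \t\r\n\u000C]+ -> skip ;
```

With Letter and LetterOrDigit declared as fragments, a lone letter such as b can only come out of the lexer as ValueString (or as Field when a : follows), not as a separate Letter token.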
"[With fragment keyword] You can also define rules that are not tokens but rather aid in the recognition of tokens. These fragment rules do not result in tokens visible to the parser." (source https://theantlrguy.atlassian.net/wiki/display/ANTLR4/Lexer+Rules)