Problem description
My grammar (shown below, trimmed down from the original) requires somewhat overlapping rules:
grammar NOVIANum;
statement : (priorityStatement | integerStatement)* ;
priorityStatement : T_PRIO TwoDigits ;
integerStatement : T_INTEGER Integer ;
WS : [ \t\r\n]+ -> skip ;
T_PRIO : 'PRIO' ;
T_INTEGER : 'INTEGER' ;
Integer: OneToNine Digit* | ZERO ;
TwoDigits : Digit Digit ;
fragment OneToNine : ('1'..'9') ;
fragment Digit: ('0'..'9');
ZERO : [0] ;
So "Integer" and "TwoDigits" overlap to a certain extent.
The following input
INTEGER 10
PRIO 10
results in
line 2:5 mismatched input '10' expecting TwoDigits
when Integer precedes TwoDigits in the grammar, and in
line 1:8 mismatched input '10' expecting Integer
when TwoDigits precedes Integer in the grammar.
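The order dependence above is consistent with how an ANTLR lexer picks a token: the longest match wins, and when two rules match the same length, the rule listed first wins. The Python sketch below only imitates that policy; the helper `first_token` and the regex versions of the two rules are assumptions made for illustration, not ANTLR code.

```python
import re

def first_token(rules, text):
    """Return (rule_name, lexeme) for the token at the start of text.

    Imitates the lexer policy: the longest match wins, and when several
    rules match the same length, the rule listed first wins.
    """
    best_name, best_len = None, 0
    for name, pattern in rules:
        m = re.match(pattern, text)
        # Strict '>' keeps the earlier rule when lengths are equal.
        if m and len(m.group()) > best_len:
            best_name, best_len = name, len(m.group())
    return best_name, text[:best_len]

# Hand-written regex equivalents of the two lexer rules (assumption).
INTEGER = ("Integer", r"[1-9][0-9]*|0")
TWO_DIGITS = ("TwoDigits", r"[0-9][0-9]")

print(first_token([INTEGER, TWO_DIGITS], "10"))  # ('Integer', '10')
print(first_token([TWO_DIGITS, INTEGER], "10"))  # ('TwoDigits', '10')
```

Both rules match all of "10", so the token type is decided purely by rule order, and whichever statement expects the other type reports a mismatch.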
Is there a way around this?
Thanks - Alex
Thanks @GRosenberg. Your suggestion worked for this small example, of course, but when I integrated it into my full grammar it led, sure enough, to different mismatched input errors.
The reason is another lexer rule that requires a range of '[1-4]', so I thought I'd be clever and turn it into
grammar NOVIANum;
statement : (priorityT | integerT | levelT )* ;
priorityT : T_PRIO twoDigits ;
integerT : T_INTEGER integer ;
levelT : T_LEVEL levelNumber ;
levelNumber : ( ZERO DIGIT ) | ( OneToFour (ZERO | DIGIT) ) ;
integer: ZERO* ( DIGIT ( DIGIT | ZERO )* ) ;
twoDigits : (ZERO | DIGIT) ( ZERO | DIGIT ) ;
oneToFour : OneToFour (DIGIT | ZERO) ;
WS : [ \t\r\n]+ -> skip ;
T_INTEGER : 'INTEGER' ;
T_LEVEL : 'LEVEL' ;
T_PRIO : 'PRIO' ;
DIGIT: OneToFour | FiveToNine ;
ZERO : '0' ;
OneToFour : [1-4] ;
FiveToNine : [5-9] ;
This still works for the previous inputs, but ...
INTEGER 350
PRIO 10
LEVEL 01
LEVEL 05
LEVEL 10
LEVEL 49
results in
[@0,0:6='INTEGER',<2>,1:0]
[@1,8:8='3',<5>,1:8]
[@2,9:9='5',<5>,1:9]
[@3,10:10='0',<6>,1:10]
[@4,12:15='PRIO',<4>,2:0]
[@5,17:17='1',<5>,2:5]
[@6,18:18='0',<6>,2:6]
[@7,20:24='LEVEL',<3>,3:0]
[@8,26:26='0',<6>,3:6]
[@9,27:27='1',<5>,3:7]
[@10,29:33='LEVEL',<3>,4:0]
[@11,35:35='0',<6>,4:6]
[@12,36:36='5',<5>,4:7]
[@13,38:42='LEVEL',<3>,5:0]
[@14,44:44='1',<5>,5:6]
[@15,45:45='0',<6>,5:7]
[@16,47:51='LEVEL',<3>,6:0]
[@17,53:53='4',<5>,6:6]
[@18,54:54='9',<5>,6:7]
[@19,55:54='<EOF>',<-1>,6:8]
line 5:6 no viable alternative at input '1'
line 6:6 no viable alternative at input '4'
(statement (integerT INTEGER (integer 3 5 0)) (priorityT PRIO (twoDigits 1 0)) (levelT LEVEL (levelNumber 0 1)) (levelT LEVEL (levelNumber 0 5)) (levelT LEVEL (levelNumber 1 0)) (levelT LEVEL (levelNumber 4 9)))
What am I missing here?
Edit 2:
OK, answering my own question here, of course:
DIGIT: OneToFour | FiveToNine ;
kicks in where it shouldn't, even in this combined form, so about the only way around this that I can think of would be
grammar NOVIANum;
statement : (priorityT | integerT | levelT )* ;
priorityT : T_PRIO twoDigits ;
integerT : T_INTEGER integer ;
levelT : T_LEVEL levelNumber ;
levelNumber : ( ZERO (OneToFour | FiveToNine) | ( OneToFour (ZERO | (OneToFour | FiveToNine)) ) ) ;
integer: ZERO* ( (OneToFour | FiveToNine) ( (OneToFour | FiveToNine) | ZERO )* ) ;
twoDigits : (ZERO | (OneToFour | FiveToNine)) ( ZERO | (OneToFour | FiveToNine) ) ;
WS : [ \t\r\n]+ -> skip ;
T_INTEGER : 'INTEGER' ;
T_LEVEL : 'LEVEL' ;
T_PRIO : 'PRIO' ;
// DIGIT: OneToFour | FiveToNine;
ZERO : '0' ;
OneToFour : [1-4] ;
FiveToNine : [5-9] ;
because when I create a parser rule for it like
oneToNine : OneToFour | FiveToNine ;
it gives me this
(integerT INTEGER (integer (oneToNine 3) (oneToNine 5) 0))
which is ugly and harder to handle than just
(integerT INTEGER (integer 3 5 0))
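The shadowing described in Edit 2 can be reproduced with a small simulation. In the second grammar, DIGIT is a full lexer rule listed before OneToFour and FiveToNine, so every single character in 1..9 is emitted as DIGIT and the parser never sees an OneToFour token; the token dump, where '1' and '4' carry the same type <5> as '3' and '5', appears consistent with that. The `tokenize` helper and the regex patterns below are assumptions for illustration, not ANTLR code.

```python
import re

# Lexer rules in the order they appear in the second grammar.
# Patterns are hand-written regex equivalents (an assumption of this sketch).
RULES = [
    ("DIGIT", r"[1-9]"),       # DIGIT: OneToFour | FiveToNine
    ("ZERO", r"0"),
    ("OneToFour", r"[1-4]"),
    ("FiveToNine", r"[5-9]"),
]

def tokenize(rules, text):
    """Longest match wins; on equal length the rule listed first wins."""
    tokens, pos = [], 0
    while pos < len(text):
        best_name, best_len = None, 0
        for name, pattern in rules:
            m = re.compile(pattern).match(text, pos)
            if m and len(m.group()) > best_len:
                best_name, best_len = name, len(m.group())
        if best_name is None:
            raise ValueError(f"no rule matches at position {pos}")
        tokens.append((best_name, text[pos:pos + best_len]))
        pos += best_len
    return tokens

# With DIGIT present, OneToFour is never produced.
print(tokenize(RULES, "49"))      # [('DIGIT', '4'), ('DIGIT', '9')]
# With DIGIT removed, as in the third grammar, the ranges become visible.
print(tokenize(RULES[1:], "49"))  # [('OneToFour', '4'), ('FiveToNine', '9')]
```

This is why the `OneToFour (ZERO | DIGIT)` alternative of levelNumber cannot match "10" or "49" while `ZERO DIGIT` still matches "01" and "05".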
Recommended answer
As a general design principle, always try to handle distinguishing elements and their objects (T_PRIO -> TwoDigits) at the same level, either parser or lexer. Presuming the semantic nature of the Integer and TwoDigits rules is important, promote them to the parser and let the lexer produce only digits. That is, don't over-constrain the lexer.
In the parser, you can let the integer rule functionally hide the twoDigits rule except in the evaluation of the priorityStatement rule:
priorityStatement : T_PRIO twoDigits ;
integerStatement : T_INTEGER integer ;
integer: ZERO | ( DIGIT ( DIGIT | ZERO )* ) ;
twoDigits : ( DIGIT | ZERO ) ( DIGIT | ZERO ) ; // ZERO is a separate token, so it must be allowed here to accept e.g. 10
T_PRIO : 'PRIO' ;
T_INTEGER : 'INTEGER' ;
DIGIT : [1-9] ;
ZERO : '0' ;
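As a rough illustration of the idea above, the following Python sketch keeps the lexer trivial (each character becomes a ZERO or DIGIT token) and moves the distinction between integer and twoDigits into parser-level checks chosen by the keyword. All function names are hypothetical, and the twoDigits check here accepts ZERO in either position, which is an assumption made so that an input such as PRIO 10 is accepted.

```python
def lex(text):
    """Minimal lexer: every character becomes a ZERO or DIGIT token."""
    tokens = []
    for ch in text:
        if ch == "0":
            tokens.append("ZERO")
        elif ch in "123456789":
            tokens.append("DIGIT")
        else:
            raise ValueError(f"unexpected character {ch!r}")
    return tokens

def is_integer(tokens):
    """integer : ZERO | DIGIT (DIGIT | ZERO)* ;"""
    if tokens == ["ZERO"]:
        return True
    return (
        len(tokens) >= 1
        and tokens[0] == "DIGIT"
        and all(t in ("DIGIT", "ZERO") for t in tokens[1:])
    )

def is_two_digits(tokens):
    """twoDigits : (DIGIT | ZERO) (DIGIT | ZERO) ;  (assumed form)"""
    return len(tokens) == 2 and all(t in ("DIGIT", "ZERO") for t in tokens)

def check_statement(line):
    """The keyword decides which parser-level rule applies to the digits."""
    keyword, number = line.split()
    tokens = lex(number)
    if keyword == "INTEGER":
        return is_integer(tokens)
    if keyword == "PRIO":
        return is_two_digits(tokens)
    raise ValueError(f"unknown keyword {keyword!r}")

print(check_statement("INTEGER 10"))  # True
print(check_statement("PRIO 10"))     # True
print(check_statement("PRIO 100"))    # False
```

Because the same token stream serves both rules, the digits "10" no longer have to be committed to one token type before the parser knows which statement it is in.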