Problem description
I am writing a lexer for Haskell in JavaScript using a Parsing Expression Grammar (PEG); the implementation I use is PEG.js.
I am having trouble making it work for reserved words, as shown in simplified form here:
program = ( word / " " )+
word = ( reserved / id )
id = ( "a" / "b" )+
reserved = ( "aa" )
The point here is to get a series of tokens that are either arbitrary sequences of a's and/or b's or the sequence "aa", separated by spaces.
What I actually get is either that every token that is not a space is recognized as id, or that a token that should be recognized as id has all its initial pairs of a's eaten up as reserved, e.g. "aab" gets recognized as reserved "aa" followed by id "b".
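Both outcomes follow from PEG's ordered choice: the first alternative that matches wins, and there is no longest-match rule. The following is a minimal sketch in plain JavaScript that mimics the grammar above with hand-written combinators. The helpers lit, choice, plus and tokenize are hypothetical names made up for illustration and are not part of PEG.js.

```javascript
// Each parser takes (input, pos) and returns the new position, or -1 on failure.
const lit = (s) => (input, pos) =>
  input.startsWith(s, pos) ? pos + s.length : -1;

// Ordered choice: the first alternative that succeeds wins.
const choice = (...ps) => (input, pos) => {
  for (const p of ps) {
    const next = p(input, pos);
    if (next !== -1) return next;
  }
  return -1;
};

// One or more repetitions, greedy.
const plus = (p) => (input, pos) => {
  let cur = p(input, pos);
  if (cur === -1) return -1;
  for (;;) {
    const next = p(input, cur);
    if (next === -1 || next === cur) return cur;
    cur = next;
  }
};

// The rules from the question.
const reserved = lit("aa");
const id = plus(choice(lit("a"), lit("b")));

// word = (first / second); spaces separate tokens.
function tokenize(input, word) {
  const tokens = [];
  let pos = 0;
  while (pos < input.length) {
    if (input[pos] === " ") { pos += 1; continue; }
    const hit = word
      .map(([kind, p]) => [kind, p(input, pos)])
      .find(([, next]) => next !== -1);
    if (!hit) throw new Error("no rule matches at position " + pos);
    tokens.push([hit[0], input.slice(pos, hit[1])]);
    pos = hit[1];
  }
  return tokens;
}

const reservedFirst = [["reserved", reserved], ["id", id]];
const idFirst = [["id", id], ["reserved", reserved]];

console.log(tokenize("aab", reservedFirst)); // [["reserved","aa"],["id","b"]]
console.log(tokenize("aa", idFirst));        // [["id","aa"]]
```

With reserved listed first, "aa" is taken as soon as it matches, regardless of what follows; with id listed first, reserved is never reached because id also matches "aa".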
The way the Haskell lexical specification solves this ambiguity is to specify id like this:
id = ( "a" / "b" )+[BUT NOT reserved]
I have tried replicating this using various combinations of the PEG ! and & operators to achieve the same effect, but have not found a way to get it to work properly.
The solution:
id = !reserved ( "a" / "b" )+
that I've seen suggested in several places does not work.
Is this a limitation of the particular PEG implementation, of PEG itself, or (hopefully) of my methods?
Thanks in advance!
Recommended answer
!reserved ident is a perfectly acceptable technique in any PEG implementation, and PEG.js seems to support it as well. By the way, you should add !id after the definition of reserved.
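The combined effect of the two lookaheads can be sketched in plain JavaScript with a few hand-written combinators. This is an illustration under stated assumptions, not PEG.js itself: lit, choice, seq, plus, not and tokenize are hypothetical helpers, and the sketch reads the suggested !id as "not followed by another identifier character", expressed through a hypothetical helper rule idChar. In grammar form that would be roughly reserved = "aa" !idChar and id = !reserved idChar+.

```javascript
// Each parser takes (input, pos) and returns the new position, or -1 on failure.
const lit = (s) => (input, pos) =>
  input.startsWith(s, pos) ? pos + s.length : -1;

// Ordered choice: the first alternative that succeeds wins.
const choice = (...ps) => (input, pos) => {
  for (const p of ps) {
    const next = p(input, pos);
    if (next !== -1) return next;
  }
  return -1;
};

// Sequence: every part must succeed, each starting where the previous ended.
const seq = (...ps) => (input, pos) => {
  let cur = pos;
  for (const p of ps) {
    cur = p(input, cur);
    if (cur === -1) return -1;
  }
  return cur;
};

// One or more repetitions, greedy.
const plus = (p) => (input, pos) => {
  let cur = p(input, pos);
  if (cur === -1) return -1;
  for (;;) {
    const next = p(input, cur);
    if (next === -1 || next === cur) return cur;
    cur = next;
  }
};

// Negative lookahead (PEG "!"): succeeds without consuming input when p fails.
const not = (p) => (input, pos) => (p(input, pos) === -1 ? pos : -1);

// Assumed reading of the answer: a keyword must not run into more id characters.
const idChar = choice(lit("a"), lit("b"));
const reserved = seq(lit("aa"), not(idChar));
const id = seq(not(reserved), plus(idChar));
const word = [["reserved", reserved], ["id", id]];

// program = (word / " ")+ with the token kinds recorded.
function tokenize(input) {
  const tokens = [];
  let pos = 0;
  while (pos < input.length) {
    if (input[pos] === " ") { pos += 1; continue; }
    const hit = word
      .map(([kind, p]) => [kind, p(input, pos)])
      .find(([, next]) => next !== -1);
    if (!hit) throw new Error("no rule matches at position " + pos);
    tokens.push([hit[0], input.slice(pos, hit[1])]);
    pos = hit[1];
  }
  return tokens;
}

console.log(tokenize("aa aab b aaaa"));
// [["reserved","aa"],["id","aab"],["id","b"],["id","aaaa"]]
```

The lookahead in this sketch is on a single character rather than on the full id rule. If reserved and id each negate the other directly, the two rules become mutually recursive through their lookaheads, and an input such as "aaaa" appears to come out as two reserved tokens.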