Question
I'm writing a parser for a project and got stuck on an issue. Here's a self-contained example of the problem:
%error-verbose
%{
#include <stdio.h>
#include <stdlib.h>

int yylex(void);
void yyerror(const char *str);
%}
%token ID
%token VAR
%token END_VAR
%token CONSTANT
%token AT
%token VALUE
%%
unit: regular_var_decl
    | direct_var_decl;
regular_var_decl: VAR constant_opt ID ':' VALUE ';' END_VAR;
constant_opt: /* empty */ | CONSTANT;
direct_var_decl: VAR ID AT VALUE ':' VALUE ';' END_VAR;
%%
/* Stub lexer: returns a fixed token sequence, then 0 for end of input. */
int yylex(void) {
    static int i = 0;
    static int tokens[] = {
        VAR,
        ID, ':', VALUE, ';',
        END_VAR,
        0
    };
    return tokens[i++];
}

void yyerror(const char *str) {
    printf("fail: %s\n", str);
}

int main(void) {
    yyparse();
    return 0;
}
It can be built with bison test.y && cc test.tab.c && ./a.out.
It warns me that constant_opt is useless due to conflicts.
This ambiguity could be resolved with two tokens of lookahead (LALR(2)), since after ID the parser would find either ':' or AT. How can I solve this in bison?
Recommended answer
A simple solution is to just not abbreviate the optional CONSTANT:
regular_var_decl: VAR ID ':' VALUE ';' END_VAR;
constant_var_decl: VAR CONSTANT ID ':' VALUE ';' END_VAR;
direct_var_decl: VAR ID AT VALUE ':' VALUE ';' END_VAR;
That allows the reduction decision to be deferred until enough information is known. (You could factor ':' VALUE ';' END_VAR into a non-terminal if that were useful.)
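As a rough sketch of that factoring, the rules section might look like the following. The name var_tail is hypothetical, the unit rule is extended here to list the new constant_var_decl alternative, and the prologue and C epilogue from the question are assumed unchanged and omitted.

```yacc
%token ID VAR END_VAR CONSTANT AT VALUE
%%
unit: regular_var_decl
    | constant_var_decl
    | direct_var_decl;

/* No empty rule sits between VAR and ID, so the parser can simply
   shift ID and decide later, when it sees ':' or AT. */
regular_var_decl:  VAR ID var_tail;
constant_var_decl: VAR CONSTANT ID var_tail;
direct_var_decl:   VAR ID AT VALUE var_tail;

/* var_tail is a hypothetical name for the shared suffix. */
var_tail: ':' VALUE ';' END_VAR;
%%
```

With this shape, every decision after VAR is a shift, so one token of lookahead should be enough.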
Another possibility is to leave the grammar as it was and ask bison to produce a GLR parser (%glr-parser). The GLR parser will effectively retain two (or more) parallel parses until the ambiguity can be resolved, which should certainly fix the constant_opt problem. (Note that the shift/reduce conflicts are still reported by bison; it cannot tell whether the language is actually ambiguous until an ambiguous sentence is discovered at runtime.) Much of the time, no additional change to the grammar needs to be made, but it does slow the parse down a little bit.
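A minimal sketch of the GLR variant, assuming the only change to the question's file is the added %glr-parser declaration (the prologue and C epilogue are omitted here):

```yacc
%glr-parser
%error-verbose
%token ID VAR END_VAR CONSTANT AT VALUE
/* Optional: %expect 1 would silence the report, assuming bison
   counts exactly one shift/reduce conflict for this grammar. */
%%
unit: regular_var_decl
    | direct_var_decl;

/* On VAR followed by ID, the GLR parser follows both alternatives:
   reducing the empty constant_opt, and shifting ID directly.
   The next token (':' or AT) leaves only one of them alive. */
regular_var_decl: VAR constant_opt ID ':' VALUE ';' END_VAR;
constant_opt: /* empty */ | CONSTANT;
direct_var_decl: VAR ID AT VALUE ':' VALUE ';' END_VAR;
%%
```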
A final possibility, probably less useful here, is to accept a superset of the language and then issue an error message in an action:
var_decl: VAR const_opt ID at_value_opt ':' VALUE ';' END_VAR {
if (/* pseudocode */ $2 && $4) { /* flag a syntax error */ }
}
That depends on the two opt non-terminals returning a semantic value which can somehow be tested for emptiness.
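One way this might be filled in, assuming the default int semantic value type and using 0/1 flags (my own convention, not something the grammar requires) to record whether each optional part was present:

```yacc
%token ID VAR END_VAR CONSTANT AT VALUE
%%
unit: var_decl;

var_decl: VAR const_opt ID at_value_opt ':' VALUE ';' END_VAR {
        /* $2 and $4 are 1 when the optional part was present. */
        if ($2 && $4) {
            yyerror("CONSTANT cannot be combined with AT");
            YYERROR;  /* make the parser treat this as a syntax error */
        }
    };

const_opt:    /* empty */ { $$ = 0; }
            | CONSTANT    { $$ = 1; };

at_value_opt: /* empty */ { $$ = 0; }
            | AT VALUE    { $$ = 1; };
%%
```

Here each empty rule is reduced only once the next token is already known (ID for const_opt, ':' for at_value_opt), so a single token of lookahead should suffice.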