END OF FILE token with flex and bison (only works without it)

Question

OK, this is kind of an odd question, because what I have here works the way I want it to. I'm writing a parser for lambda calculus expressions. An expression can be one of four things:

  • variable
  • constant
  • (expression expression)
  • (lambda variable.expression)

Now as you can see, the last two expressions have expressions within them. What I was trying to do was determine the overall expression so I can report which type it is. So for example the expression ((lambda x.(f1 x)) 100) is a combination overall. My idea was to return an END token from flex when it reached the end of file. My code looks like this:

overallexpr: combo END { printf(" The overall expression is a combination\n"); } |
         constant END { printf(" The overall expression is a constant\n"); } |
         VARIABLE END { printf(" The overall expression is a variable\n"); } |
         l_expr END { printf(" The overall expression is a lambda expression\n"); }
;

expr: combo | constant | VARIABLE | l_expr
;

combo: LPARENS expr expr RPARENS
;

constant: FUNCTION | NUMBER
;

l_expr: LPARENS LAMBDA VARIABLE DOT expr RPARENS
;

If I put that END token after each of the four possibilities in overallexpr (combo END instead of just combo, and so on), it doesn't work, even though the parser does receive the END token. If I print each token as it is read (with variable, function, and number values), it looks like this:

LPARENS  LPARENS  LAMBDA  VARIABLE x  DOT  LPARENS  FUNCTION f1  VARIABLE x  RPARENS  RPARENS  NUMBER 100  RPARENS  END Sorry, Charlie

It may be hard to tell, but this should work. The combination ends with the RPARENS and there's an END token right after it, yet it isn't recognized as an overall expression. However, if I take out the END tokens, it seems to work every time: I always get an overall message printed, even though the alternatives of overallexpr and expr are then exactly the same. The output is identical to the one above, except that it says "The overall expression is a combination" before the END token. So my question is: why? Does bison always just try the earlier productions first? And why would it work without the END but not with it, especially since you can see the END token arrive right after it says it's a combination? I'm just trying to get a better understanding of how Bison works.

Recommended answer

It's a little hard to tell what's going on here without seeing your code (and I don't really want to wade through it, anyway), but I'll hazard a guess: my guess is that you're replacing the standard yylex EOF indication (i.e., returning 0) with your END token. If the bison parser never sees an EOF, it never finishes the parse.
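To make the "returning 0" convention concrete, here is a minimal hand-written stand-in for yylex, sketched in C. It is not the asker's flex scanner: the token codes, the names next_token and set_input, and the rule that a one-letter identifier is a VARIABLE while a longer one is a FUNCTION are all assumptions made for illustration. The point is only the end-of-input branch, which returns 0 rather than a user-defined END token.

```c
#include <assert.h>
#include <ctype.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical token codes; a real build would take them from bison's header. */
enum { TOK_EOF = 0, LPARENS = 258, RPARENS, LAMBDA, DOT, VARIABLE, FUNCTION, NUMBER };

static const char *cursor;  /* next unread character of the input */

/* Point the scanner at a NUL-terminated input string. */
static void set_input(const char *text) { cursor = text; }

/* Stand-in for yylex(): returns 0 at end of input, on every call. */
static int next_token(void) {
    assert(cursor != NULL);
    while (isspace((unsigned char)*cursor)) cursor++;
    if (*cursor == '\0') return TOK_EOF;   /* the standard EOF indication */
    if (*cursor == '(') { cursor++; return LPARENS; }
    if (*cursor == ')') { cursor++; return RPARENS; }
    if (*cursor == '.') { cursor++; return DOT; }
    if (isdigit((unsigned char)*cursor)) {
        while (isdigit((unsigned char)*cursor)) cursor++;
        return NUMBER;
    }
    if (isalpha((unsigned char)*cursor)) {
        const char *start = cursor;
        while (isalnum((unsigned char)*cursor)) cursor++;
        size_t len = (size_t)(cursor - start);
        if (len == 6 && strncmp(start, "lambda", 6) == 0) return LAMBDA;
        /* Assumption: one letter is a variable, anything longer a function. */
        return len == 1 ? VARIABLE : FUNCTION;
    }
    /* Any other character is returned as its own code. */
    return (unsigned char)*cursor++;
}
```

Calling set_input("((lambda x.(f1 x)) 100)") and then next_token() repeatedly yields the same token sequence as the trace in the question, followed by 0 instead of END.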

In effect, bison creates a special production all of its own:

__parse__: __start__ $;

__parse__ is the (actually unnamed) production, and __start__ is whatever you've declared as %start (or the first non-terminal, if you don't declare it explicitly). In your case, I suppose it's overallexpr. $ is the symbol conventionally used to indicate the EOF mark. (In bison's own reports, these two symbols appear as $accept and $end.)

Now, when do bison parser actions happen? Although in some cases, they can happen where you think they will (i.e. immediately after the last token in the production), they usually don't happen until the parser takes a peek at the following token. It's allowed to do that; that's why it's called an LALR(1) parser: the 1 is the number of future tokens it's allowed to look at before deciding exactly what to do with the ones it's already got. It almost always needs this information, and often works as though it did even if it seems to you and me that it doesn't.

So in all probability, the parser will not actually do the overallexpr reduction -- or, in other words, it won't execute the action associated with the overallexpr rule -- until it convinces itself that the end-of-file marker is the next token.
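The role of that final lookahead can be sketched in C. The code below is a hand-written recursive-descent classifier, not bison's LALR(1) machinery, and every name in it (classify, parse_expr, expect, the kind enum, the token codes) is an assumption made for illustration. What it shares with the situation described above is the last step: the overall result is reported only if the single lookahead token after the top-level expression is 0, the end-of-input code.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical token codes; 0 is the end-of-input code, END is a user token. */
enum { TOK_EOF = 0, LPARENS = 258, RPARENS, LAMBDA, DOT,
       VARIABLE, FUNCTION, NUMBER, END };

enum kind { K_ERROR, K_COMBO, K_CONSTANT, K_VARIABLE, K_LAMBDA };

static const int *pos;  /* *pos is the single lookahead token */

/* Consume the lookahead if it equals tok; never step past the final 0. */
static int expect(int tok) {
    if (*pos != tok) return 0;
    if (tok != TOK_EOF) pos++;
    return 1;
}

/* expr: VARIABLE | FUNCTION | NUMBER | ( expr expr ) | ( lambda VARIABLE . expr ) */
static enum kind parse_expr(void) {
    if (expect(VARIABLE)) return K_VARIABLE;
    if (expect(FUNCTION) || expect(NUMBER)) return K_CONSTANT;
    if (!expect(LPARENS)) return K_ERROR;
    if (expect(LAMBDA)) {
        if (expect(VARIABLE) && expect(DOT) &&
            parse_expr() != K_ERROR && expect(RPARENS))
            return K_LAMBDA;
        return K_ERROR;
    }
    if (parse_expr() != K_ERROR && parse_expr() != K_ERROR && expect(RPARENS))
        return K_COMBO;
    return K_ERROR;
}

/* Classify a 0-terminated token array; succeed only if EOF follows the expr. */
static enum kind classify(const int *tokens) {
    assert(tokens != NULL);
    pos = tokens;
    enum kind k = parse_expr();
    if (k == K_ERROR || *pos != TOK_EOF) return K_ERROR;
    return k;
}
```

Given the token sequence from the question with a terminating 0, classify returns K_COMBO. If any nonzero token such as END sits between the closing RPARENS and the 0, this sketch reports K_ERROR, because its grammar has no place for that token.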

Now, if you leave your END token out of the rule and the lexer actually returns EOF, then bison will do the reduction when it sees the EOF.


09-05 18:49