flex/bison: How to switch between two lexers on the same input file


Question

How can I hand over an open file, e.g. one already being read by one scanner, to the next scanner, and give it to the parser?

Recommended answer

Flex buffers cannot easily be transferred from one scanner to another. Many details are private to the scanner and would need to be reverse-engineered, with the consequent loss of maintainability.

However, it is not difficult to combine two (or more) scanner definitions into a single scanner, provided that the semantic types are compatible. It is simply necessary to give them different start conditions. Since the start condition can be set even outside of a scanner action, it is trivial to switch from one scanner definition to the other.
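
To make the idea concrete without depending on flex itself, here is a minimal hand-written sketch of one scanner that holds two rule sets and a mode flag. It is not what flex generates, and every name in it (`struct scanner`, `scanner_set_mode`, `scanner_next`, the `MODE_*` and `TOK_*` constants) is hypothetical. The mode plays the role of a start condition, and `scanner_set_mode` corresponds to setting the start condition from outside a scanner action.

```c
#include <assert.h>
#include <ctype.h>
#include <stddef.h>

/* Hypothetical names throughout: a hand-written stand-in for a
 * flex scanner with two start conditions, not flex's own API. */
enum mode  { MODE_TEXT, MODE_EXPR };   /* plays the role of a start condition */
enum token { TOK_EOF, TOK_TEXT, TOK_NUMBER, TOK_PUNCT };

struct scanner {
  const char *pos;   /* shared input position (the "open file") */
  enum mode   mode;  /* which rule set is currently active */
};

/* Analogous to setting the start condition outside a scanner action. */
void scanner_set_mode(struct scanner *s, enum mode m) {
  s->mode = m;
}

enum token scanner_next(struct scanner *s) {
  assert(s != NULL && s->pos != NULL);
  if (*s->pos == '\0') return TOK_EOF;

  if (s->mode == MODE_TEXT) {
    /* Rule set 1: everything up to and including a newline is one token. */
    while (*s->pos != '\0' && *s->pos != '\n') ++s->pos;
    if (*s->pos == '\n') ++s->pos;
    return TOK_TEXT;
  }

  /* Rule set 2: skip white space, then numbers or one-character tokens. */
  while (*s->pos == ' ' || *s->pos == '\t' || *s->pos == '\n') ++s->pos;
  if (*s->pos == '\0') return TOK_EOF;
  if (isdigit((unsigned char)*s->pos)) {
    while (isdigit((unsigned char)*s->pos)) ++s->pos;
    return TOK_NUMBER;
  }
  ++s->pos;
  return TOK_PUNCT;
}
```

Because both rule sets read through the same `pos`, nothing has to be handed over when the mode changes; the next call simply continues from where the previous one stopped. That is the property the combined flex scanner gets for free by sharing one input buffer.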

Since Flex scanners are table-based, there is no real inefficiency in combining the two scanners; indeed, there may be some value in not duplicating the code. The combined table may be slightly larger than the sum of the individual tables, because there are likely to be more character equivalence classes, but on the other hand the larger table may allow better table compression. Neither of these effects is likely to be noticeable.

Here's a simple but possibly useful example. This parser reads a file and substitutes ${arithmetic expressions} with the evaluated expression. (Since it's just an example, only very basic expressions are allowed, but it should be easy to extend.)

Since the lexical scanner needs to start in start condition SC_ECHO, it needs to be initialized. Personally, I'd prefer to start in INITIAL to avoid this initialization in this simple case, but sometimes scanners need to be able to handle various start conditions, so I left the code in. The error handling could be improved, but it's functional.

The parser uses a very simple error rule to resynchronize and keep track of substitution errors. The semantic value of the non-terminals subst, file and start is the error count for the file; the semantic value for expr is the value of the expression. In this simple case, they are both just integers so the default type for yylval works.

Unterminated substitutions are not handled gracefully; in particular, if EOF is read during the lexical scan for a substitution, no indication is inserted into the output. I leave fixing that as an exercise. :)
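
As a starting point for that exercise, the following sketch shows the detection step in plain C, using the same kind of brace counting the lexer in this answer relies on. The helper `find_substitution_end` is hypothetical and is not part of the flex or bison files; it only illustrates how running out of input before the closing brace can be recognised so that the caller can report it.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical helper, not part of the lexer or parser in this answer.
 * `s` points just past an opening "${". Returns a pointer to the '}'
 * that closes the substitution, or NULL if the input ends first. */
const char *find_substitution_end(const char *s) {
  assert(s != NULL);
  int braces = 0;                  /* nesting depth of inner '{' ... '}' */
  for (; *s != '\0'; ++s) {
    if (*s == '{') {
      ++braces;
    } else if (*s == '}') {
      if (braces == 0) return s;   /* this brace ends the substitution */
      --braces;
    }
  }
  return NULL;                     /* unterminated: caller can report it */
}
```

In the flex scanner itself, the equivalent place to react would be an end-of-file rule for the start condition that scans expressions; what to emit there (an error token, a message in the output, or both) is a design choice left open here.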

Here is the lexer:

%{
#include <stdlib.h>   /* strtol */
#include "xsub.tab.h"
%}
%option noinput nounput noyywrap nodefault
%option yylineno
%x SC_ECHO
%%
   /* In a reentrant lexer, this would go into the state object */
   static int braces;

   /* This start condition just echoes until it finds ${... */
<SC_ECHO>{
  "${"        braces = 0; BEGIN(INITIAL);
  [^$\n]+     ECHO;
  "$"         ECHO;
  \n          ECHO;
}
 /* We need to figure out where the substitution ends, which is why we can't
  * just use a standard calculator. Here we deal with terminations.
  */
"{"           ++braces; return '{';
"}"           { if (braces) { --braces; return '}'; }
                else        { BEGIN(SC_ECHO); return FIN; }
              }

 /* The rest is just a normal calculator */
[0-9]+        yylval = strtol(yytext, NULL, 10); return NUMBER;
[[:blank:]]+  /* Ignore white space */
\n            /* Ignore newlines, too (but could also be an error) */
.             return yytext[0];

%%
void initialize_scanner(void) {
  BEGIN(SC_ECHO);
}

The parser exports a single interface:

int parseFile(FILE *in, FILE *out);

which returns 0 if all went well, and a nonzero value if any substitution was incorrect (modulo the issue mentioned above with unterminated substitutions). Note that the start rule calls YYABORT when the error count is nonzero, so yyparse returns 1 rather than the count itself. Here's the file:

%{
#include <stdio.h>
int yylex(void);
void yyerror(const char* msg);
void initialize_scanner(void);

extern int yylineno;
extern FILE *yyin, *yyout;
%}
%token NUMBER FIN UNOP
%left '+' '-'
%left '*' '/' '%'
%nonassoc UNOP

%define parse.lac full
%define parse.error verbose
%%
start: file          { if ($1) YYABORT; else YYACCEPT; }
file :               { $$ = 0; }
     | file subst    { $$ = $1 + $2; }
subst: expr FIN      { fprintf(yyout, "%d", $1); $$ = 0; }
     | error FIN     { fputs("${ BAD SUBSTITUTION }", yyout); $$ = 1; }
expr : NUMBER
     | '-' expr %prec UNOP { $$ = -$2; }
     | '(' expr ')'  { $$ = $2; }
     | expr '+' expr { $$ = $1 + $3; }
     | expr '-' expr { $$ = $1 - $3; }
     | expr '*' expr { $$ = $1 * $3; }
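     | expr '/' expr { if ($3 == 0) { yyerror("division by zero"); YYERROR; }
                       $$ = $1 / $3; }
     | expr '%' expr { if ($3 == 0) { yyerror("division by zero"); YYERROR; }
                       $$ = $1 % $3; }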
%%
void yyerror(const char* msg) {
  fprintf(stderr, "%d: %s\n", yylineno, msg);
}

int parseFile(FILE* in, FILE* out) {
  initialize_scanner();
  yyin = in;
  yyout = out;
  return yyparse();
}

And a simple driver:

#include <stdio.h>
int parseFile(FILE* in, FILE* out);
int main() {
  return parseFile(stdin, stdout);
}


08-19 21:13