Antlr error strategy to skip tokens until rule matches again

Problem description

I tried this solution but it didn't seem to work for me

Here is an excerpt of my grammar:

module
    : BEGIN MODULE IDENT STRING module_element* END MODULE
    ;

module_element
    : element_1 | element_2 | element_3 | ...
    ;

There is a bigger tree below each element. Now when a RecognitionException occurs I want to consume tokens until either the next module_element matches or the parent END MODULE matches.

Any hints on how to do this inside a class inheriting from DefaultErrorStrategy?

Here is an MCVE:

Program.cs

using System;
using System.Text;
using Antlr4.Runtime;
using AntlrExample.Parser;

namespace AntlrExample
{
    class Program
    {
        static void Main(string[] args)
        {
            var fileToParse = @"C:\temp\MyGrammarExample.txt";

            try
            {
                Parse(fileToParse);
            }
            catch (Exception e)
            {
                Console.WriteLine("Exception: " + e);
            }
        }

        private static void Parse(string filePath)
        {
            var lexer = new MyGrammarLexer(new AntlrFileStream(filePath, Encoding.Default));

            var parser = new MyGrammarParser(new CommonTokenStream(lexer));

            parser.AddParseListener(new MyGrammarListener());

            parser.startnode();
        }
    }
}

MyGrammar.g4:

grammar MyGrammar;

@parser::members
{
    protected const int EOF = Eof;
}

@lexer::members
{
    protected const int EOF = Eof;
    protected const int HIDDEN = Hidden;
}

startnode
    :   module
    ;

module
    : BEGIN MODULE IDENT STRING module_element* END MODULE
    ;

module_element
    :   element_1 | element_2
    ;

element_1
    :   BEGIN ELEMENT1 name=IDENT desc=STRING other1=IDENT other2=IDENT END ELEMENT1
    ;

element_2
    :   BEGIN ELEMENT2 name=IDENT desc=STRING other1=IDENT other2=IDENT other3=INT END ELEMENT2
    ;

BEGIN : 'BEGIN';
MODULE: 'MODULE';
END: 'END';
ELEMENT1 : 'ELEMENT1';
ELEMENT2 : 'ELEMENT2';

IDENT
    : LETTER (LETTER|'0'..'9'|'['|']'|'.')*
    ;

fragment LETTER
    : 'A'..'Z' | 'a'..'z' | '_'
    ;

STRING
    : '"' ('\\' (.) | '"''"' | ~( '\\' | '"'))* '"'
    ;

INT
    : MINUS? DIGIT+
    ;

fragment MINUS
    : '-'
    ;

DIGIT
    : '0'..'9'
    ;

WS
    : ( ' ' | '\t' | '\r' | '\n')+ -> skip
    ;

MyGrammarListener.cs

using System;

namespace AntlrExample.Parser
{
    public class MyGrammarListener : MyGrammarBaseListener
    {
        public override void ExitElement_1(MyGrammarParser.Element_1Context context)
        {
            Console.WriteLine(string.Format("Just parsed an ELEMENT1: {0} {1} {2} {3}", context.name.Text, context.desc.Text, context.other1.Text, context.other2.Text));
        }

        public override void ExitElement_2(MyGrammarParser.Element_2Context context)
        {
            Console.WriteLine(string.Format("Just parsed an ELEMENT2: {0} {1} {2} {3} {4}", context.name.Text, context.desc.Text, context.other1.Text, context.other2.Text, context.other3.Text));
        }
    }
}

MyGrammarExample.txt

BEGIN MODULE MyModule "This is the main module"

    BEGIN ELEMENT1 MyElement1 "This is the first element"
        Something
        Anything
    END ELEMENT1

    BEGIN ELEMENT1 MyElement2 "This is the second element"
        SomethingMore
        AnythingMore
    END ELEMENT1

    BEGIN ELEMENT2 MyFirstElement2 "This one will fail"
        Foo
        Bar
        HereShouldBeAnInt
    END ELEMENT2

    BEGIN ELEMENT2 MySecondElement2 "This one should parse even though the parser failed to parse the one before"
        RealFoo
        RealBar
        34
    END ELEMENT2

END MODULE

Recommended answer

You should be able to accomplish this with this error strategy class:

using Antlr4.Runtime;
using Antlr4.Runtime.Misc;

internal class MyGrammarErrorStrategy : DefaultErrorStrategy
{
    public override void Recover(Parser recognizer, RecognitionException e)
    {
        // This should move the current position to the next 'END' token
        base.Recover(recognizer, e);

        ITokenStream tokenStream = (ITokenStream)recognizer.InputStream;

        // Verify we are where we expect to be
        if (tokenStream.La(1) == MyGrammarParser.END)
        {
            // Get the next possible tokens
            IntervalSet intervalSet = GetErrorRecoverySet(recognizer);

            // Move to the next token
            tokenStream.Consume();

            // Move to the next possible token
            // If the errant element is the last in the set, this will move to the 'END' token in 'END MODULE'.
            // If there are subsequent elements in the set, this will move to the 'BEGIN' token in 'BEGIN module_element'.
            ConsumeUntil(recognizer, intervalSet);
        }
    }
}

Then set the parser's error handler accordingly:

parser.ErrorHandler = new MyGrammarErrorStrategy();

The idea is that we first allow the default Recover implementation to move the current position to the "resynchronization set," which in this case is the next END token. Subsequently, we consume additional tokens using the provided error recovery set to move the position to where we need it to be. This resulting position will differ based on whether or not the errant module_element is the last in the module.
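
To make the two-phase skip easier to follow, here is a minimal sketch that simulates it on a plain list of token type names, without the ANTLR runtime. The class RecoverySketch, its methods, and the hard-coded recovery set { BEGIN, END } are assumptions made for illustration only; in the real strategy the set comes from GetErrorRecoverySet and depends on the parser state at the time of the error.

```csharp
using System;
using System.Collections.Generic;

// Hypothetical stand-in for the parser's token stream: token types are plain strings.
public static class RecoverySketch
{
    // Assumed recovery set: the tokens that may follow a module_element inside module.
    private static readonly HashSet<string> RecoverySet =
        new HashSet<string> { "BEGIN", "END" };

    // Returns the index at which parsing would resume after an error at 'position'.
    public static int Recover(IReadOnlyList<string> tokens, int position)
    {
        // Phase 1: stands in for base.Recover, which skips ahead to the resynchronization set.
        position = ConsumeUntil(tokens, position);

        // Phase 2: only if we stopped on an END (the END of the failed element).
        if (position < tokens.Count && tokens[position] == "END")
        {
            position++; // consume that END
            // Skip the element keyword that follows it and stop at the next BEGIN or END.
            position = ConsumeUntil(tokens, position);
        }

        return position;
    }

    // Advances past every token that is not in the recovery set (or stops at the end of input).
    private static int ConsumeUntil(IReadOnlyList<string> tokens, int position)
    {
        while (position < tokens.Count && !RecoverySet.Contains(tokens[position]))
        {
            position++;
        }
        return position;
    }
}
```

If the failed element is followed by another element, the returned index points at that element's BEGIN; if it was the last element, it points at the END of END MODULE.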