Problem description
ANTLR: Is it possible to make a grammar with an embedded grammar (with its own lexer) inside?
For example, my language can contain embedded SQL:
var Query = [select * from table];
with Query do something ....;
Is this possible with ANTLR?
Recommended answer
If you mean whether it is possible to define two languages in a single grammar (using separate lexers), then the answer is: no, that's not possible.
However, if the question is whether it is possible to parse two languages into a single AST, then the answer is: yes, it is possible.
You just need to:
- define both languages in their own grammar;
- create a lexer rule in your main grammar that captures the entire input of the embedded language;
- use a rewrite rule that calls a custom method that parses the embedded input and inserts the resulting AST into the main AST using {...} (see the expr rule in the main grammar (MyLanguage.g)).
MyLanguage.g
grammar MyLanguage;

options {
  output=AST;
  ASTLabelType=CommonTree;
}

tokens {
  ROOT;
}

@members {
  // Parses the embedded SQL source with the MiniSQL lexer and parser and returns its AST.
  private CommonTree parseSQL(String sqlSrc) {
    try {
      MiniSQLLexer lexer = new MiniSQLLexer(new ANTLRStringStream(sqlSrc));
      MiniSQLParser parser = new MiniSQLParser(new CommonTokenStream(lexer));
      return (CommonTree)parser.parse().getTree();
    } catch(Exception e) {
      // On failure, return a single node that carries the error message.
      return new CommonTree(new CommonToken(-1, e.getMessage()));
    }
  }
}

parse
  :  assignment+ EOF -> ^(ROOT assignment+)
  ;

assignment
  :  Var Id '=' expr ';' -> ^('=' Id expr)
  ;

expr
  :  Num
  |  SQL -> {parseSQL($SQL.text)}
  ;

Var   : 'var';
Id    : ('a'..'z' | 'A'..'Z')+;
Num   : '0'..'9'+;
SQL   : '[' ~']'* ']';
Space : ' ' {skip();};
MiniSQL.g
grammar MiniSQL;

options {
  output=AST;
  ASTLabelType=CommonTree;
}

parse
  :  '[' statement ']' EOF -> statement
  ;

statement
  :  select
  ;

select
  :  Select '*' From ID -> ^(Select '*' From ID)
  ;

Select : 'select';
From   : 'from';
ID     : ('a'..'z' | 'A'..'Z')+;
Space  : ' ' {skip();};
Main.java
import org.antlr.runtime.*;
import org.antlr.runtime.tree.*;
import org.antlr.stringtemplate.*;

public class Main {
  public static void main(String[] args) throws Exception {
    String src = "var Query = [select * from table]; var x = 42;";
    MyLanguageLexer lexer = new MyLanguageLexer(new ANTLRStringStream(src));
    MyLanguageParser parser = new MyLanguageParser(new CommonTokenStream(lexer));
    CommonTree tree = (CommonTree)parser.parse().getTree();
    DOTTreeGenerator gen = new DOTTreeGenerator();
    StringTemplate st = gen.toDOT(tree);
    System.out.println(st);
  }
}
Running the demo
java -cp antlr-3.3.jar org.antlr.Tool MiniSQL.g
java -cp antlr-3.3.jar org.antlr.Tool MyLanguage.g
javac -cp antlr-3.3.jar *.java
java -cp .:antlr-3.3.jar Main
With the input:
var Query = [select * from table]; var x = 42;
the output of the Main class is a DOT description of the following AST (written here in LISP-style notation):
(ROOT (= Query (select * from table)) (= x 42))
And if you want to allow string literals inside your SQL (which could contain ]), and comments (which could contain ' and ]), then you could use the following SQL rule inside your main grammar:
SQL
  :  '[' ( ~(']' | '\'' | '-')
         | '-' ~'-'
         | COMMENT
         | STR
         )*
     ']'
  ;

fragment STR
  :  '\'' (~('\'' | '\r' | '\n') | '\'\'')+ '\''
  |  '\'\''
  ;

fragment COMMENT
  :  '--' ~('\r' | '\n')*
  ;
which would properly tokenize the following input as a single token:
[
select a,b,c
from table
where a='A''B]C'
and b='' -- some ] comment ] here'
]
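To experiment with the boundary logic outside ANTLR, the extended SQL rule can be approximated by a small hand-written scanner. The following is a minimal sketch in plain Java, not generated code: the class name SqlTokenSketch and the methods matchSql and skipString are hypothetical names introduced here for illustration. It differs from the grammar rule in one respect: a lone '-' is consumed on its own rather than together with the character that follows it.

```java
class SqlTokenSketch {

    /**
     * Returns the index just past the closing ']' of the bracketed block that
     * starts at 'start', or -1 if the block is malformed or unterminated.
     */
    static int matchSql(String s, int start) {
        if (start >= s.length() || s.charAt(start) != '[') {
            return -1;
        }
        int i = start + 1;
        while (i < s.length()) {
            char c = s.charAt(i);
            if (c == ']') {
                return i + 1; // a ']' outside strings and comments closes the block
            } else if (c == '\'') {
                i = skipString(s, i); // string literal, may contain ']'
                if (i < 0) {
                    return -1;
                }
            } else if (c == '-' && i + 1 < s.length() && s.charAt(i + 1) == '-') {
                // "--" comment: skip to the end of the line
                while (i < s.length() && s.charAt(i) != '\r' && s.charAt(i) != '\n') {
                    i++;
                }
            } else {
                i++; // any other character, including a lone '-'
            }
        }
        return -1; // ran out of input before the closing ']'
    }

    /**
     * Skips a single-quoted literal that opens at index 'open'. A doubled
     * quote ('') inside the literal is an escaped quote. Returns the index
     * just past the closing quote, or -1 if the literal is not closed on
     * the same line.
     */
    static int skipString(String s, int open) {
        int i = open + 1;
        while (i < s.length()) {
            char c = s.charAt(i);
            if (c == '\'') {
                if (i + 1 < s.length() && s.charAt(i + 1) == '\'') {
                    i += 2; // escaped quote, stay inside the literal
                } else {
                    return i + 1; // closing quote
                }
            } else if (c == '\r' || c == '\n') {
                return -1; // line break inside a literal is not allowed
            } else {
                i++;
            }
        }
        return -1;
    }

    public static void main(String[] args) {
        String sample = "[\nselect a,b,c\nfrom table\nwhere a='A''B]C'\n"
                      + "and b='' -- some ] comment ] here'\n]";
        // Reports whether the scanner consumed the whole sample as one block.
        System.out.println(matchSql(sample, 0) == sample.length());
    }
}
```

Calling matchSql on the sample input above returns the length of that input, meaning the whole block, including the ] characters inside the string literal and the comment, is consumed as one unit.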
Just beware that trying to create a grammar for an entire SQL dialect (or even a large subset) is no trivial task! You may want to search for existing SQL parsers, or look at the ANTLR wiki for example grammars.