Problem description
C source code can be parsed according to the C grammar (described as a CFG) and eventually turned into ASTs. I am wondering whether a tool exists that does the reverse: it first randomly generates many ASTs according to the CFG, whose tokens carry only their token types and no concrete string values, and then generates concrete tokens according to the regular expressions that define each token type.
I imagine the first step as an iterative replacement of non-terminals, chosen at random and bounded by a maximum number of iterations. The second step is just randomly generating strings that match the regular expressions.
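The two steps described above can be sketched roughly as follows. This is only an illustration under stated assumptions: the toy grammar, the token names, and the helper functions `derive`, `leaves`, and `concretize` are all hypothetical, the token generators are hand-written stand-ins for regular-expression definitions, and the first alternative of each rule is assumed to be the one that terminates.

```python
import random

# Hypothetical toy CFG: each nonterminal maps to a list of alternatives.
# Assumption: the FIRST alternative of every rule leads to termination.
GRAMMAR = {
    "expr": [["term"], ["term", "PLUS", "expr"]],
    "term": [["NUMBER"], ["IDENT"], ["LPAREN", "expr", "RPAREN"]],
}

# Hand-written stand-ins for the regular expressions defining each token type.
TOKENS = {
    "PLUS": lambda rng: "+",
    "LPAREN": lambda rng: "(",
    "RPAREN": lambda rng: ")",
    "NUMBER": lambda rng: str(rng.randint(0, 999)),
    "IDENT": lambda rng: rng.choice("abcxyz")
    + "".join(rng.choice("abc012_") for _ in range(rng.randint(0, 3))),
}

def derive(symbol, rng, depth):
    """Step 1: randomly expand nonterminals into a tree of token types."""
    if symbol in TOKENS:
        return symbol  # leaf: a token type with no concrete string yet
    alternatives = GRAMMAR[symbol]
    # Once the depth budget is spent, force the terminating alternative.
    chosen = alternatives[0] if depth <= 0 else rng.choice(alternatives)
    return (symbol, [derive(s, rng, depth - 1) for s in chosen])

def leaves(tree):
    """Collect the token types at the leaves, left to right."""
    if isinstance(tree, str):
        return [tree]
    _, children = tree
    return [t for child in children for t in leaves(child)]

def concretize(token_types, rng):
    """Step 2: turn each token type into a concrete random string."""
    return " ".join(TOKENS[t](rng) for t in token_types)

if __name__ == "__main__":
    rng = random.Random()
    tree = derive("expr", rng, depth=6)
    print(leaves(tree))
    print(concretize(leaves(tree), rng))
```

The depth budget plays the role of the iteration limit mentioned in the question: without it, a recursive rule such as `expr` could keep expanding indefinitely.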
Is there any tool that can do this?
Recommended answer
The "Data Generation Language" (DGL) does this, with the added ability to weight the probabilities of the productions in the grammar that drives the output.
In general, a recursive descent parser can be rewritten quite directly into a set of recursive procedures that generate the language instead of parsing or recognising it.
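As a rough illustration of that rewrite, here is a minimal sketch for a hypothetical arithmetic grammar (`expr := term | term ('+' | '-') expr`, `term := NUMBER | '(' expr ')'`). Each `gen_*` function sits where a parser would have a matching `parse_*` function, but it chooses an alternative at random instead of inspecting the next token. The weights, `MAX_DEPTH`, and all function names are assumptions made for this example, and the weighting is plain Python, not DGL syntax.

```python
import random

RNG = random.Random()
MAX_DEPTH = 8  # assumed recursion budget so generation always terminates

def pick(weighted_alternatives):
    """Choose one label according to its (label, weight) pair."""
    labels, weights = zip(*weighted_alternatives)
    return RNG.choices(labels, weights=weights, k=1)[0]

def gen_expr(depth=0):
    # Generator counterpart of parse_expr():
    #   expr := term | term ('+' | '-') expr
    if depth >= MAX_DEPTH:
        return gen_term(depth + 1)
    if pick([("single", 6), ("binary", 4)]) == "single":
        return gen_term(depth + 1)
    op = RNG.choice(["+", "-"])
    return f"{gen_term(depth + 1)} {op} {gen_expr(depth + 1)}"

def gen_term(depth=0):
    # Generator counterpart of parse_term():
    #   term := NUMBER | '(' expr ')'
    if depth >= MAX_DEPTH:
        return gen_number()
    if pick([("number", 7), ("paren", 3)]) == "number":
        return gen_number()
    return f"({gen_expr(depth + 1)})"

def gen_number():
    # Stand-in for a NUMBER token defined by a regex such as [0-9]+.
    return str(RNG.randint(0, 99))

if __name__ == "__main__":
    for _ in range(5):
        print(gen_expr())
```

Raising the weight of the recursive alternatives yields longer, more deeply nested output, while the depth limit guarantees that every call eventually bottoms out in `gen_number`.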