


I want to generate sentences randomly from a given context-free grammar.

The randomness is the important part: my grammar is quite large, and NLTK's generator enumerates every possible utterance, which breaks down on recursive rules (e.g. E -> A E) and takes too long to produce "interesting" utterances in a short time (interesting meaning unlike the utterances that precede the current one).


Are there any Python libraries for that? Thanks!



NLTK doesn't provide a method to generate random sentences from a grammar, although, as noted in the related SO question "How to use NLTK to generate sentences from an induced grammar?", it can generate random sentences from trigrams.

If you want to write your own Python function, you might be interested in this 1997 paper by Bruce Mackenzie, "Generating Strings at Random from a Context Free Grammar". (I found the link in an answer to a different SO question.) The algorithm precomputes weights in an O(N) preprocessing step and requires that the grammar have no epsilon productions (productions that expand to the empty string).
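As a rough illustration of the "precompute weights, then sample" idea, here is a minimal pure-Python sketch. It is not the paper's algorithm verbatim: it simply counts how many derivations yield exactly n tokens and then picks rules and split points in proportion to those counts, so every derivation of length n is equally likely. The toy grammar and all names (`GRAMMAR`, `count_sym`, `count_seq`, `random_sentence`) are made up for this example, and it assumes the grammar has no epsilon productions and no cycles of unit productions (such as A -> B, B -> A).

```python
import random
from functools import lru_cache

# Hypothetical toy grammar: dict keys are nonterminals, everything else is a terminal.
# It includes the recursive rule E -> A E from the question.
GRAMMAR = {
    "E": [("A", "E"), ("A",)],
    "A": [("a",), ("b",)],
}


@lru_cache(maxsize=None)
def count_seq(symbols, n):
    """Number of derivations in which the symbol sequence yields exactly n tokens."""
    if not symbols:
        return 1 if n == 0 else 0
    first, rest = symbols[0], symbols[1:]
    if first not in GRAMMAR:  # a terminal consumes exactly one token
        return count_seq(rest, n - 1) if n >= 1 else 0
    # Without epsilon productions every remaining symbol needs at least one token.
    return sum(
        count_sym(first, k) * count_seq(rest, n - k)
        for k in range(1, n - len(rest) + 1)
    )


@lru_cache(maxsize=None)
def count_sym(symbol, n):
    """Number of derivations in which a nonterminal yields exactly n tokens."""
    return sum(count_seq(rhs, n) for rhs in GRAMMAR[symbol])


def sample_seq(symbols, n, rng):
    """Sample tokens for a symbol sequence; only called when count_seq(symbols, n) > 0."""
    if not symbols:
        return []
    first, rest = symbols[0], symbols[1:]
    if first not in GRAMMAR:
        return [first] + sample_seq(rest, n - 1, rng)
    # Choose how many tokens `first` produces, weighted by the derivation counts.
    splits = list(range(1, n - len(rest) + 1))
    weights = [count_sym(first, k) * count_seq(rest, n - k) for k in splits]
    k = rng.choices(splits, weights=weights)[0]
    return sample_sym(first, k, rng) + sample_seq(rest, n - k, rng)


def sample_sym(symbol, n, rng):
    """Pick one production for `symbol`, weighted by how many derivations it allows."""
    rules = GRAMMAR[symbol]
    weights = [count_seq(rhs, n) for rhs in rules]
    rhs = rng.choices(rules, weights=weights)[0]
    return sample_seq(rhs, n, rng)


def random_sentence(start, n, rng=random):
    """Return a random sentence of exactly n tokens derived from `start`."""
    if count_sym(start, n) == 0:
        raise ValueError(f"{start!r} cannot derive a sentence of length {n}")
    return " ".join(sample_sym(start, n, rng))


if __name__ == "__main__":
    print(random_sentence("E", 6))
```

Because the counts are memoized, repeated calls reuse the precomputed weights, and the recursive rule never causes runaway expansion since the target length is fixed up front. Note that the sampling is uniform over derivations, so with an ambiguous grammar some strings will come up more often than others.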

