Problem description
Can a DTD be generated from an XML file using Python?
Recommended answer
The simple answer to the question you ask is "yes, a DTD can be generated from an XML document using Python".
Python is a Turing-complete language, and there are algorithms for generating a DTD from any arbitrary collection of XML or SGML. I believe the standard reference is Rick Kazman, "Structuring the text of the Oxford English Dictionary through finite state transduction," Centre for the New Oxford English Dictionary Tech. Report OED-86-20, Univ. of Waterloo (June 1986), 117 pp.
In the late 1980s, the library consortium OCLC developed a tool called Fred, which induced DTDs for bodies of SGML documents; I heard a lot about it informally but do not recall ever seeing published descriptions of its algorithms. However, a quick search of the Web for "OCLC Fred SGML DTD" produces a pointer to Keith E. Shafer, Fred: the SGML Grammar Builder (1996). (A quick glance showed a great deal of material, but I did not see any clear reference to a high-level description of the algorithms used.)
There is also a Norwegian thesis from 1994: Sunniva M. K. Solstrand, "Automatisk generering av DTD fra SGML-kodet materiale", Hovedfagsoppgave i informasjonsvitenskap, Universitetet i Bergen, 1994.
As may be seen, there are several computer scientists who do not agree with the commenters who have told you your question is pointless or wrong. It is true, of course, that the quality of document grammar achieved by automatic grammar induction tends to be lower than the quality of document grammar achieved by a human document analyst and DTD writer.
I suspect that the DTD generated would be more plausible if it restricted itself to the content-model patterns described in various articles by Fabio Vitali and his collaborators in Bologna. The initial paper was, I believe, Fabio Vitali, Angelo Di Iorio, and Daniele Gubellini, "Design patterns for descriptive document substructures", Extreme Markup Languages 2005, and later papers have elaborated and described applications. New work in Bologna by Francesco Poggi (not yet published) extends and deepens the analysis. A Web search for "XML design patterns" may provide other attempts at similar sets of grammatical patterns. From a grammar-induction point of view, the effect of such patterns is to reduce the complexity of the induction problem by targeting simpler grammars.
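To make the idea of targeting simpler grammars concrete, here is a minimal Python sketch that infers a deliberately loose DTD from a single XML document. It is not Kazman's algorithm, not what Fred does, and not the Bologna pattern set; it is only a naive illustration that restricts every element to one of four simple content models: EMPTY, (#PCDATA), a repeatable choice of child elements, or mixed content. The function name infer_dtd and the sample document are made up for this example, and the sketch assumes a document without namespaces and declares every attribute as CDATA #IMPLIED.

```python
import xml.etree.ElementTree as ET


def infer_dtd(xml_text):
    """Infer a loose DTD from one XML document (illustrative sketch only)."""
    root = ET.fromstring(xml_text)
    children = {}  # element name -> set of child element names seen
    has_text = {}  # element name -> True if non-whitespace text was seen
    attrs = {}     # element name -> set of attribute names seen

    # root.iter() walks the tree in document order, so the dicts end up
    # keyed in first-seen order.
    for elem in root.iter():
        name = elem.tag
        kids = children.setdefault(name, set())
        attrs.setdefault(name, set()).update(elem.attrib)
        text_seen = bool(elem.text and elem.text.strip())
        for child in elem:
            kids.add(child.tag)
            # Text after a child element belongs to the parent's content.
            if child.tail and child.tail.strip():
                text_seen = True
        has_text[name] = has_text.get(name, False) or text_seen

    lines = []
    for name in children:
        kids = sorted(children[name])
        if has_text[name] and kids:
            model = "(#PCDATA | " + " | ".join(kids) + ")*"  # mixed content
        elif has_text[name]:
            model = "(#PCDATA)"
        elif kids:
            model = "(" + " | ".join(kids) + ")*"  # any children, any order
        else:
            model = "EMPTY"
        lines.append(f"<!ELEMENT {name} {model}>")
        # Assumption: every attribute is optional character data.
        for attr in sorted(attrs[name]):
            lines.append(f"<!ATTLIST {name} {attr} CDATA #IMPLIED>")
    return "\n".join(lines)


SAMPLE = "<doc id='1'><title>Hi</title><p>Some <em>text</em> here.</p><br/></doc>"
print(infer_dtd(SAMPLE))
```

The resulting DTD accepts the input document but is far more permissive than what a human analyst would write: it records nothing about order, required children, or repetition limits. Recovering those constraints is where the real induction algorithms cited above do their work.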
If you meant to ask the rather different question "Can anyone recommend a Python-based tool for generating a DTD from an XML document?", then I can't help you (and there are lots of Stack Overflow moderators who will close the question at once because questions asking for tool recommendations are frowned upon).