Question
Does anyone know of a good wikitext parser with an easy-to-use, configurable API? I want to feed it data such as http://wikitravel.org/wiki/en/api.php?format=xml&action=parse&prop=wikitext&page=San%20Francisco, select the sections I need, and output custom HTML for each distinct type of element. Java is preferred, but a PHP or JavaScript solution that is compatible with most (99%+) wikitext would be fine as well.
Recommended answer
Sweble is probably the best Java parser for wikitext. It claims to be 100% compliant with wikitext, but I seriously doubt that. It parses wikitext into an abstract syntax tree, which you then have to process yourself (for example, by converting it to HTML).
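To make the "parse into a tree, then process it" step concrete, here is a minimal sketch of the general idea in plain Java. It does not use Sweble: the `Node`, `Text`, `Bold`, `Heading`, and `HtmlRenderer` types are hypothetical stand-ins invented for illustration, and a real parser's tree will have many more node types. The point is only that a visitor with one method per node type is a natural place to emit custom HTML for each kind of element.

```java
import java.util.Arrays;
import java.util.List;

// Toy AST plus renderer. All types here are hypothetical, NOT Sweble's API.
class WikiAstSketch {

    /** One callback per node type, so each element type gets its own HTML. */
    interface Visitor<R> {
        R visitText(Text node);
        R visitBold(Bold node);
        R visitHeading(Heading node);
    }

    interface Node {
        <R> R accept(Visitor<R> visitor);
    }

    static final class Text implements Node {
        final String value;
        Text(String value) { this.value = value; }
        public <R> R accept(Visitor<R> v) { return v.visitText(this); }
    }

    static final class Bold implements Node {
        final List<Node> children;
        Bold(Node... children) { this.children = Arrays.asList(children); }
        public <R> R accept(Visitor<R> v) { return v.visitBold(this); }
    }

    static final class Heading implements Node {
        final int level;
        final List<Node> children;
        Heading(int level, Node... children) {
            this.level = level;
            this.children = Arrays.asList(children);
        }
        public <R> R accept(Visitor<R> v) { return v.visitHeading(this); }
    }

    /** Walks the tree and emits custom HTML for each node type. */
    static final class HtmlRenderer implements Visitor<String> {
        public String visitText(Text node) {
            return escape(node.value);
        }

        public String visitBold(Bold node) {
            return "<strong>" + renderAll(node.children) + "</strong>";
        }

        public String visitHeading(Heading node) {
            String tag = "h" + node.level;
            return "<" + tag + " class=\"section\">" + renderAll(node.children) + "</" + tag + ">";
        }

        String renderAll(List<Node> nodes) {
            StringBuilder out = new StringBuilder();
            for (Node n : nodes) {
                out.append(n.accept(this));
            }
            return out.toString();
        }

        // Escape '&' first so the entities produced below are not re-escaped.
        static String escape(String s) {
            return s.replace("&", "&amp;").replace("<", "&lt;").replace(">", "&gt;");
        }
    }

    static String render(Node root) {
        return root.accept(new HtmlRenderer());
    }

    public static void main(String[] args) {
        // Hand-built tree standing in for what a parser would produce.
        Node tree = new Heading(2, new Text("Get in & "), new Bold(new Text("around")));
        System.out.println(render(tree));
    }
}
```

Selecting only some sections of a page would then amount to filtering the tree, for instance by skipping headings whose text you do not care about, before or during rendering.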
There is a page on mediawiki.org that lists wikitext parsers in various programming languages. I don't think any of them handle 99%+ of wikitext, though. In general, parsing wikitext is a really complex problem: wikitext isn't even formally defined anywhere outside of the MediaWiki parser itself.