Question
I am looking for an up-to-date, memory-efficient, high-performance Java XML parsing API. I need to parse XML files of 3 MB to 5 MB.
I searched on Google and learned that the Sun Java Streaming XML Parser (SJSXP) and Woodstox are much faster than DOM and SAX. Both implement the StAX API. Note: schema validation is not supported by these technologies.
The Aalto XML processor also implements the StAX API.
I have not found concrete performance results for these technologies.
Which one is the best choice in terms of memory efficiency, performance, and ease of use?
Recommended answer
Here are some more links that might be relevant:
- StAX implementations for data binding: http://technotes.blogs.sapo.pt/1708.html
- Using Woodstox efficiently: http://www.cowtowncoder.com/blog/archives/2006/06/entry_2.html
- Speeding up XSLT with Woodstox: http://www.cowtowncoder.com/blog/archives/2009/04/entry_235.html
As to performance: SJSXP is the slowest; it is just the internals of Xerces repackaged and wrapped in the StAX API. This has some negative effects on performance (since Xerces was not really designed for pull parsing). Woodstox is a bit faster: much faster for small documents and for writing, with less of a difference when parsing longer documents.
Aalto is by far the fastest of the three, especially for parsing. It is commonly 50% to 100% faster than either Woodstox or SJSXP. One downside is that it does not handle DTDs (and therefore not external entities either; it does handle predefined and character entities).
Disclaimer: I am the author of Woodstox and Aalto, as well as a contributor to SJSXP (bug fixes).
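Since all three parsers implement the same StAX API, the same pull-parsing code can be run against each of them to compare speed and memory use on your own 3 MB to 5 MB files. Below is a minimal sketch that uses only the standard `javax.xml.stream` classes. The class name `StaxCountExample`, the helper `countElements`, and the sample XML are made up for illustration and are not taken from any of the linked posts. `XMLInputFactory.newInstance()` normally picks up a StAX implementation found on the classpath (for example a Woodstox or Aalto jar) and otherwise falls back to the one bundled with the JDK.

```java
import java.io.Reader;
import java.io.StringReader;
import javax.xml.stream.XMLInputFactory;
import javax.xml.stream.XMLStreamConstants;
import javax.xml.stream.XMLStreamException;
import javax.xml.stream.XMLStreamReader;

public class StaxCountExample {

    // Hypothetical helper: streams through the document and counts the
    // start tags whose local name matches, without building a tree.
    static int countElements(Reader in, String localName) throws XMLStreamException {
        // Resolves to whichever StAX implementation is available at runtime.
        XMLInputFactory factory = XMLInputFactory.newInstance();
        XMLStreamReader reader = factory.createXMLStreamReader(in);
        int count = 0;
        try {
            // Pull events one at a time; only the current event is held in memory.
            while (reader.hasNext()) {
                int event = reader.next();
                if (event == XMLStreamConstants.START_ELEMENT
                        && localName.equals(reader.getLocalName())) {
                    count++;
                }
            }
        } finally {
            // Releases parser resources; does not close the underlying Reader.
            reader.close();
        }
        return count;
    }

    public static void main(String[] args) throws Exception {
        // Made-up sample input; replace with a FileReader or InputStream for real files.
        String xml = "<orders><order id=\"1\"/><order id=\"2\"/><note>hi</note></orders>";
        System.out.println(countElements(new StringReader(xml), "order")); // prints 2
    }
}
```

Because the code depends only on the StAX interfaces, switching parsers should come down to changing which implementation jar is on the classpath, which makes it a reasonable starting point for a like-for-like benchmark.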