Problem description
I have parsed XML using both of the following methods:
- Parsing into an XmlDocument, using the object model and XPath queries.
- XSLT.
But I have never used:
- The LINQ to XML object model, new in .NET 3.5.
Can anyone tell me how the three alternatives compare in efficiency?
I realise that the particular usage would be a factor, but I just want a rough idea. For example, is the LINQ option massively slower than the others?
Recommended answer
The absolute fastest way to query an XML document is also the hardest: write a method that uses an XmlReader to process the input stream, and have it process nodes as it reads them. This combines parsing and querying into a single operation. (Simply using XPath doesn't do this; both XmlDocument and XPathDocument parse the whole document when it is loaded.) This is usually only a good idea if you're processing extremely large streams of XML data.
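To make the XmlReader approach concrete, here is a minimal sketch. The element name "item", the attribute name "price", and the method name SumItemPrices are made up for illustration and do not come from the question. It answers a query (the total of all prices) while the stream is being read, without ever building a document in memory.

```csharp
using System;
using System.IO;
using System.Xml;

public static class StreamingQuery
{
    // Sums the "price" attribute of every <item> element as the reader
    // passes over it. Element and attribute names are hypothetical.
    public static decimal SumItemPrices(TextReader input)
    {
        decimal total = 0m;
        using (XmlReader reader = XmlReader.Create(input))
        {
            // Read() advances one node at a time; nothing is retained
            // once the reader has moved past a node.
            while (reader.Read())
            {
                if (reader.NodeType == XmlNodeType.Element && reader.Name == "item")
                {
                    var price = reader.GetAttribute("price");
                    if (price != null)
                    {
                        total += XmlConvert.ToDecimal(price);
                    }
                }
            }
        }
        return total;
    }
}
```

The cost of this style is that the query logic is tangled into the read loop, so every new query means writing and testing another loop.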
All three methods you've described perform similarly. XSLT has a lot of room to be the slowest of the lot, because it lets you combine the inefficiencies of XPath with the inefficiencies of template matching. XPath and LINQ queries both do essentially the same thing, which is a linear search through enumerable lists of XML nodes. I would expect LINQ to be marginally faster in practice, because XPath expressions are parsed and interpreted at runtime, while LINQ queries are compiled ahead of time along with the rest of your code.
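As an illustration of how closely the two approaches mirror each other, the sketch below runs the same query both ways. The document shape (a library element containing book elements with a year attribute and a title child) and the method names are assumptions made up for this example. In the first method the query is a string that gets parsed when the program runs; in the second it is ordinary C# checked by the compiler.

```csharp
using System.Collections.Generic;
using System.Linq;
using System.Xml;
using System.Xml.Linq;

public static class EquivalentQueries
{
    // XPath over XmlDocument: the expression string is parsed at runtime.
    public static List<string> TitlesWithXPath(string xml)
    {
        var doc = new XmlDocument();
        doc.LoadXml(xml);
        var titles = new List<string>();
        foreach (XmlNode node in doc.SelectNodes("/library/book[@year='2008']/title"))
        {
            titles.Add(node.InnerText);
        }
        return titles;
    }

    // LINQ to XML: the same walk, written as compiled method calls.
    public static List<string> TitlesWithLinq(string xml)
    {
        XDocument doc = XDocument.Parse(xml);
        return doc.Root
            .Elements("book")
            .Where(b => (string)b.Attribute("year") == "2008")
            .Elements("title")
            .Select(t => t.Value)
            .ToList();
    }
}
```

Both versions load the whole document first and then step through the book elements one by one, which is why their performance tends to be in the same ballpark.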
But in general, how you write your query is going to have a much greater impact on execution speed than which technology you use.
The way to write fast queries against XML documents is the same whether you're using XPath or LINQ: formulate the query so that as few nodes as possible get visited during its execution. It doesn't matter which technology you use: a query that examines every node in the document is going to run a lot slower than one that examines only a small subset of them. Your ability to do that depends more on the structure of the XML than on anything else: a document with a navigable hierarchy of elements is generally going to be a lot faster to query than one whose elements are all children of the document element.
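A small sketch of the difference, using LINQ to XML. The document layout (a library element with a books branch and a members branch) and the method names are hypothetical. Both methods return the same titles for such a document, but the first walks every element in the tree while the second only descends into the one branch that can contain a match.

```csharp
using System.Collections.Generic;
using System.Linq;
using System.Xml.Linq;

public static class QueryShape
{
    // Broad: Descendants() enumerates every element in the document,
    // including the whole <members> branch, looking for <title>.
    public static List<string> TitlesBroad(XDocument doc)
    {
        return doc.Descendants("title").Select(t => t.Value).ToList();
    }

    // Narrow: follows the known path library/books/book/title and never
    // looks inside sibling branches such as <members>.
    public static List<string> TitlesNarrow(XDocument doc)
    {
        return doc.Root
            .Element("books")
            .Elements("book")
            .Elements("title")
            .Select(t => t.Value)
            .ToList();
    }
}
```

The XPath equivalents would be roughly //title versus /library/books/book/title. The narrow form is only possible because the hierarchy tells you where the titles live; in a flat document there is no branch to skip.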
Edit:
While I'm pretty sure I'm right that the absolute fastest way to query XML is the hardest, the real fastest (and hardest) way doesn't use an XmlReader; it uses a state machine that directly processes characters from a stream. Like parsing XML with regular expressions, this is ordinarily a terrible idea. But it does give you the option of exchanging features for speed. By deciding not to handle those pieces of XML that you don't need for your application (e.g. namespace resolution, expansion of character entities, etc.) you can build something that will seek through a stream of characters faster than an XmlReader would. I can think of applications where this isn't even a bad idea, though I can't think of many.
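For a sense of what such a state machine looks like, here is a deliberately tiny sketch; the class and method names are made up for this example. It counts start tags with a given name by looking at raw characters. It is not an XML parser: it assumes the input contains no comments, CDATA sections, or processing instructions that might contain tag-like text, and it ignores namespaces and entities entirely. Those omissions are exactly the features being traded away for speed.

```csharp
using System.IO;

public static class RawTagScanner
{
    // Counts start tags named tagName by scanning characters directly.
    // Assumes no comments, CDATA, or processing instructions in the input.
    public static int CountStartTags(TextReader input, string tagName)
    {
        int count = 0;
        // State: -1 means "not inside a candidate tag name"; otherwise the
        // number of characters of tagName matched so far after a '<'.
        int matched = -1;
        int c;
        while ((c = input.Read()) != -1)
        {
            char ch = (char)c;
            if (ch == '<')
            {
                matched = 0;            // a tag starts; begin matching the name
            }
            else if (matched < 0)
            {
                continue;               // ordinary content; skip
            }
            else if (matched < tagName.Length)
            {
                // Still matching the name; any mismatch abandons this tag.
                matched = (ch == tagName[matched]) ? matched + 1 : -1;
            }
            else
            {
                // Whole name matched; count it only if the name ends here,
                // so that "items" is not mistaken for "item".
                if (ch == '>' || ch == '/' || char.IsWhiteSpace(ch))
                {
                    count++;
                }
                matched = -1;
            }
        }
        return count;
    }
}
```

End tags are skipped naturally, because the '/' after '<' fails to match the first character of the name.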