Problem description
I'm writing a GIS client tool in C# to retrieve "features" in a GML-based XML schema (sample below) from a server. Extracts are limited to 100,000 features.
I guesstimate that the largest extract.xml might get up around 150 megabytes, so obviously DOM parsers are out. I've been trying to decide between XmlSerializer and XSD.EXE generated bindings --OR-- XmlReader and a hand-crafted object graph.
Or maybe there's a better way which I haven't considered yet? Like XLINQ, or ????
Please can anybody guide me? Especially with regards to the memory efficiency of any given approach. If not I'll have to "prototype" both solutions and profile them side-by-side.
I'm a bit of a raw prawn in .NET. Any guidance would be greatly appreciated.
Thanks. Keith.
Sample XML - up to 100,000 of them, with up to 234,600 coords per feature.
<feature featId="27168306" fType="vegetation" fTypeId="1129" fClass="vegetation" gType="Polygon" ID="0" cLockNr="51598" metadataId="51599" mdFileId="NRM/TIS/VEGETATION/9543_22_v3" dataScale="25000">
  <MultiGeometry>
    <geometryMember>
      <Polygon>
        <outerBoundaryIs>
          <LinearRing>
            <coordinates>153.505004,-27.42196 153.505044,-27.422015 153.503992 .... 172 coordinates omitted to save space ... 153.505004,-27.42196</coordinates>
          </LinearRing>
        </outerBoundaryIs>
      </Polygon>
    </geometryMember>
  </MultiGeometry>
</feature>
Recommended answer
Use XmlReader to parse large XML documents. XmlReader provides fast, forward-only, non-cached access to XML data. (Forward-only means you can read the XML file from beginning to end but cannot move backwards in the file.) Because it never holds the whole document, XmlReader uses only a small amount of memory, comparable to a simple SAX reader.
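// Requires: using System.Xml;
using (XmlReader myReader = XmlReader.Create(@"c:\data\coords.xml"))
{
    // Read() advances to the next node and returns false at end of file.
    while (myReader.Read())
    {
        // Process each node (myReader.Value) here
        // ...
    }
}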
You can use XmlReader to process files that are up to 2 gigabytes (GB) in size.
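If walking every node by hand feels too low-level, one common middle ground is to let XmlReader find each feature element and then materialize just that one element with XLINQ's XNode.ReadFrom. Below is a minimal sketch of that idea, not a tuned implementation. The class and method names (FeatureStream, ReadFeatures, ParseCoordinates) are made up for illustration, and it assumes the feature elements carry no XML namespace and that coordinates are whitespace-separated "x,y" pairs as in the sample.

```csharp
using System;
using System.Collections.Generic;
using System.Globalization;
using System.IO;
using System.Xml;
using System.Xml.Linq;

// Hypothetical helper, not part of any library.
public static class FeatureStream
{
    // Lazily yields one <feature> element at a time, so only the feature
    // currently being processed is held in memory as an XElement.
    public static IEnumerable<XElement> ReadFeatures(TextReader input)
    {
        using (XmlReader reader = XmlReader.Create(input))
        {
            reader.MoveToContent();
            while (!reader.EOF)
            {
                if (reader.NodeType == XmlNodeType.Element && reader.LocalName == "feature")
                {
                    // ReadFrom consumes the whole element and leaves the reader
                    // on the node after it, so Read() must not be called here.
                    yield return (XElement)XNode.ReadFrom(reader);
                }
                else
                {
                    reader.Read();
                }
            }
        }
    }

    // Assumption: coordinates are whitespace-separated "x,y" pairs.
    public static IEnumerable<(double X, double Y)> ParseCoordinates(string text)
    {
        char[] separators = { ' ', '\t', '\r', '\n' };
        foreach (string pair in text.Split(separators, StringSplitOptions.RemoveEmptyEntries))
        {
            string[] xy = pair.Split(',');
            yield return (double.Parse(xy[0], CultureInfo.InvariantCulture),
                          double.Parse(xy[1], CultureInfo.InvariantCulture));
        }
    }
}
```

You would then loop with something like foreach (XElement feature in FeatureStream.ReadFeatures(File.OpenText("extract.xml"))), read attributes via feature.Attribute("featId"), and pass each coordinates element's Value to ParseCoordinates. Peak memory is driven by the largest single feature rather than by the whole extract, but you should still profile it against your real data.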