What is the best way to parse (big) XML in C# code?

Question

I'm writing a GIS client tool in C# to retrieve "features" in a GML-based XML schema (sample below) from a server. Extracts are limited to 100,000 features.

I guesstimate that the largest extract.xml might get up to around 150 megabytes, so obviously DOM parsers are out. I've been trying to decide between XmlSerializer with XSD.EXE-generated bindings, or XmlReader with a hand-crafted object graph.

Or maybe there's a better way which I haven't considered yet? Like XLINQ, or something else?

Please can anybody guide me? Especially with regards to the memory efficiency of any given approach. If not I'll have to "prototype" both solutions and profile them side-by-side.

I'm a bit of a raw prawn in .NET. Any guidance would be greatly appreciated.

Thanks. Keith.

Sample XML - up to 100,000 of them, with up to 234,600 coords per feature.

<feature featId="27168306" fType="vegetation" fTypeId="1129" fClass="vegetation" gType="Polygon" ID="0" cLockNr="51598" metadataId="51599" mdFileId="NRM/TIS/VEGETATION/9543_22_v3" dataScale="25000">
  <MultiGeometry>
    <geometryMember>
      <Polygon>
        <outerBoundaryIs>
          <LinearRing>
            <coordinates>153.505004,-27.42196 153.505044,-27.422015 153.503992 .... 172 coordinates omitted to save space ... 153.505004,-27.42196</coordinates>
          </LinearRing>
        </outerBoundaryIs>
      </Polygon>
    </geometryMember>
  </MultiGeometry>
</feature>

Recommended answer

Use XmlReader to parse large XML documents. XmlReader provides fast, forward-only, non-cached access to XML data. (Forward-only means you can read the XML file from beginning to end but cannot move backwards in the file.) XmlReader uses small amounts of memory, and is equivalent to using a simple SAX reader.

    // Requires: using System.Xml;
    using (XmlReader myReader = XmlReader.Create(@"c:\data\coords.xml"))
    {
        while (myReader.Read())
        {
            // Process each node here, e.g. inspect myReader.NodeType,
            // myReader.Name, and myReader.Value
            // ...
        }
    }

You can use XmlReader to process files that are up to 2 gigabytes (GB) in size.
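For the feature extract in the question, the forward-only approach might look roughly like the sketch below. It is only an illustration under stated assumptions: the `Feature` class and the `FeatureStream.Read` helper are hypothetical names that do not appear in the original post, the element names are taken from the sample XML and assumed to be unprefixed, and each `<coordinates>` element is assumed to hold whitespace-separated `x,y` pairs as in the sample.

```csharp
using System;
using System.Collections.Generic;
using System.Globalization;
using System.Xml;

// Hypothetical container for one <feature>; not part of the original post.
public sealed class Feature
{
    public string FeatId { get; set; }
    public string FType { get; set; }

    // (x, y) pairs collected from every <coordinates> element in the feature.
    public List<(double X, double Y)> Coordinates { get; } =
        new List<(double X, double Y)>();
}

public static class FeatureStream
{
    // Lazily yields one Feature at a time, so the caller never needs to
    // hold more than the current feature in memory.
    public static IEnumerable<Feature> Read(XmlReader reader)
    {
        // Assumption: features are unprefixed <feature> elements, as in the sample.
        while (reader.ReadToFollowing("feature"))
        {
            var feature = new Feature
            {
                FeatId = reader.GetAttribute("featId"),
                FType = reader.GetAttribute("fType")
            };

            // ReadSubtree returns a reader scoped to this <feature> element only.
            using (XmlReader sub = reader.ReadSubtree())
            {
                bool inCoordinates = false;
                while (sub.Read())
                {
                    switch (sub.NodeType)
                    {
                        case XmlNodeType.Element:
                            inCoordinates = sub.LocalName == "coordinates";
                            break;
                        case XmlNodeType.Text:
                            if (inCoordinates)
                            {
                                ParseCoordinates(sub.Value, feature.Coordinates);
                            }
                            break;
                        case XmlNodeType.EndElement:
                            inCoordinates = false;
                            break;
                    }
                }
            }

            yield return feature;
        }
    }

    // Assumption: text is whitespace-separated "x,y" tokens; any third
    // component (for example a z value) is ignored.
    private static void ParseCoordinates(string text, List<(double X, double Y)> target)
    {
        string[] tokens = text.Split((char[])null, StringSplitOptions.RemoveEmptyEntries);
        foreach (string token in tokens)
        {
            string[] parts = token.Split(',');
            if (parts.Length < 2)
            {
                continue; // skip malformed tokens rather than fail the whole extract
            }
            double x = double.Parse(parts[0], CultureInfo.InvariantCulture);
            double y = double.Parse(parts[1], CultureInfo.InvariantCulture);
            target.Add((x, y));
        }
    }
}
```

A caller would wrap this in `using (XmlReader reader = XmlReader.Create(path))` and iterate with `foreach (Feature f in FeatureStream.Read(reader))`, handling each feature before moving on. Note that the text of a single `<coordinates>` element is still materialized as one string, so peak memory tracks the largest feature rather than the whole file.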

Reference: How to read XML from a file by using Visual C#
