Problem description
I recently had an XML file that would not load. The error message was
hexadecimal value 0x00 is an invalid character
It was produced by this minimal code in LINQPad (C# statements):
var xmlDocument = new XmlDocument();
xmlDocument.Load(@"C:\Users\Thomas\AppData\Local\Temp\tmp485D.tmp");
I went through the XML with a hex editor but could not find a 0x00 character. I reduced the XML to
<?xml version="1.0" encoding="UTF-8"?>
<x>
</x>
In my hex editor it shows up as
Offset(h) 00 01 02 03 04 05 06 07 08 09 0A 0B 0C 0D 0E 0F
00000000 FF FE 3C 00 3F 00 78 00 6D 00 6C 00 20 00 76 00 ÿþ<.?.x.m.l. .v.
00000010 65 00 72 00 73 00 69 00 6F 00 6E 00 3D 00 22 00 e.r.s.i.o.n.=.".
00000020 31 00 2E 00 30 00 22 00 20 00 65 00 6E 00 63 00 1...0.". .e.n.c.
00000030 6F 00 64 00 69 00 6E 00 67 00 3D 00 22 00 55 00 o.d.i.n.g.=.".U.
00000040 54 00 46 00 2D 00 38 00 22 00 3F 00 3E 00 0D 00 T.F.-.8.".?.>...
00000050 0A 00 3C 00 78 00 3E 00 0D 00 0A 00 3C 00 2F 00 ..<.x.>.....<./.
00000060 78 00 3E 00 x.>.
So it is easy to see that there is no 00 00 character anywhere. All even columns contain values other than 00.
Why does it complain about an invalid 0x00 character?
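Recommended answer
The problem is the encoding. The byte order mark FF FE indicates UTF-16 (little-endian), but the XML declaration specifies encoding="UTF-8". The two contradict each other: if the content is decoded as UTF-8, as the declaration requests, every 00 byte becomes a U+0000 character, which XML does not allow.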
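To illustrate where a 0x00 character can come from, here is a small sketch that decodes the first few bytes of the hex dump in two ways. It only demonstrates the decoding difference; it does not claim to reproduce the exact internal steps of XmlDocument.Load. The variable names are made up for this example, and the code assumes C# top-level statements (C# 9 or later) or LINQPad.

```csharp
using System;
using System.Linq;
using System.Text;

// First bytes of the file from the question: BOM FF FE, then "<?x" in UTF-16 LE.
byte[] bytes = { 0xFF, 0xFE, 0x3C, 0x00, 0x3F, 0x00, 0x78, 0x00 };

// Skip the two BOM bytes and decode the rest once as UTF-8 and once as UTF-16 LE.
string asUtf8 = Encoding.UTF8.GetString(bytes, 2, bytes.Length - 2);
string asUtf16 = Encoding.Unicode.GetString(bytes, 2, bytes.Length - 2);

// Read as UTF-8, each 00 byte is a separate NUL character.
Console.WriteLine(asUtf8.Count(c => c == '\0')); // prints 3
// Read as UTF-16 LE, the 00 bytes are just the high halves of ASCII characters.
Console.WriteLine(asUtf16);                      // prints <?x
```

So a hex editor showing no 00 00 pair is consistent with the error: a single 00 byte is enough once the data is interpreted as UTF-8.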
If you generate the XML yourself, there are two options:
a) write the file as UTF-8, with the UTF-8 byte order mark: EF BB BF
b) declare UTF-16 encoding: encoding="UTF-16"
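One way to keep the byte order mark and the declaration in sync is to let XmlWriter produce both from a single encoding setting. The following is a minimal sketch, not necessarily how the original file was produced; WriteSample and the temp file paths are hypothetical names introduced for this example, and the code assumes C# top-level statements (C# 9 or later) or LINQPad.

```csharp
using System;
using System.IO;
using System.Text;
using System.Xml;

// Hypothetical helper: writes an empty <x /> document in the given encoding.
// XmlWriter derives the declaration's encoding attribute from the same setting
// that determines the bytes on disk, so the two cannot disagree.
void WriteSample(string path, Encoding encoding)
{
    var settings = new XmlWriterSettings { Encoding = encoding, Indent = true };
    using XmlWriter writer = XmlWriter.Create(path, settings);
    writer.WriteStartDocument();
    writer.WriteStartElement("x");
    writer.WriteEndElement();
    writer.WriteEndDocument();
}

string utf8Path = Path.GetTempFileName();
string utf16Path = Path.GetTempFileName();
WriteSample(utf8Path, Encoding.UTF8);     // option a): UTF-8 bytes and declaration
WriteSample(utf16Path, Encoding.Unicode); // option b): UTF-16 LE bytes and declaration

foreach (string path in new[] { utf8Path, utf16Path })
{
    // Both files should load, because the BOM and the declaration agree.
    var document = new XmlDocument();
    document.Load(path);
    // Show the leading bytes so the byte order mark can be inspected.
    Console.WriteLine(BitConverter.ToString(File.ReadAllBytes(path), 0, 3));
}
```

The temp files are left in place so they can be inspected in a hex editor; delete them when done.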
If you receive the XML from someone else, there are also two options:
A) tell the author to fix the XML according to a) or b)
B) sanitize the input in your application (not preferred)
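For option B), one possible approach is to decode the file yourself and pass the resulting text to XmlDocument, so that the byte order mark decides the encoding. This is a sketch under assumptions, not the answer author's code: LoadTrustingBom is a hypothetical helper name, and it relies on XmlDocument.Load(TextReader) working with text that is already decoded rather than re-decoding according to the declaration. The code assumes C# top-level statements (C# 9 or later) or LINQPad.

```csharp
using System;
using System.IO;
using System.Text;
using System.Xml;

// Hypothetical helper: StreamReader picks the encoding from the byte order mark
// (UTF-8 is only the fallback when no BOM is present), then XmlDocument parses
// the already-decoded characters.
XmlDocument LoadTrustingBom(string path)
{
    using var reader = new StreamReader(path, Encoding.UTF8, detectEncodingFromByteOrderMarks: true);
    var document = new XmlDocument();
    document.Load(reader);
    return document;
}

// Recreate the file from the question: UTF-16 LE bytes with a BOM,
// but a declaration that claims UTF-8.
string samplePath = Path.GetTempFileName();
File.WriteAllText(
    samplePath,
    "<?xml version=\"1.0\" encoding=\"UTF-8\"?>\r\n<x>\r\n</x>",
    Encoding.Unicode);

XmlDocument doc = LoadTrustingBom(samplePath);
Console.WriteLine(doc.DocumentElement?.Name);
File.Delete(samplePath);
```

This hides the inconsistency instead of fixing it, which is one reason option A) is preferable when the author of the file can be reached.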