Problem description
Parsing an XML file using the Java DOM parser results in:
[Fatal Error] os__flag_8c.xml:103:135: An invalid XML character (Unicode: 0xc) was found in the element content of the document.
org.xml.sax.SAXParseException: An invalid XML character (Unicode: 0xc) was found in the element content of the document.
at com.sun.org.apache.xerces.internal.parsers.DOMParser.parse(Unknown Source)
at com.sun.org.apache.xerces.internal.jaxp.DocumentBuilderImpl.parse(Unknown Source)
at javax.xml.parsers.DocumentBuilder.parse(Unknown Source)
Recommended answer
A few characters are disallowed in XML documents, even when you encapsulate the data in CDATA blocks.
If you generated the document yourself, you will need to strip these characters out when producing it. If you were handed an erroneous document, you should strip these characters out before trying to parse it.
See dolmen's answer in this thread: Invalid Characters in XML
He links to this article: http://www.w3.org/TR/xml/#charsets
Basically, all characters below 0x20 are disallowed in XML 1.0, except 0x9 (TAB), 0xA (LF), and 0xD (CR). The 0xc in the error message above is the form feed character, which falls inside that disallowed range.
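As a rough illustration of the "strip before parsing" advice, here is a minimal sketch that filters a string against the Char production from the W3C page linked above and then hands the result to the DOM parser. The class name XmlSanitizer and the methods isValidXmlChar, stripInvalidXmlChars, and parseLenient are made up for this example and are not part of any library; the sketch also assumes the document fits in memory as a String and that silently dropping the offending characters is acceptable for your data.

```java
import java.io.StringReader;
import javax.xml.parsers.DocumentBuilder;
import javax.xml.parsers.DocumentBuilderFactory;
import org.w3c.dom.Document;
import org.xml.sax.InputSource;

// Hypothetical helper, not part of the JDK or of Xerces.
final class XmlSanitizer {

    // True if the code point is allowed by the XML 1.0 "Char" production:
    // #x9 | #xA | #xD | [#x20-#xD7FF] | [#xE000-#xFFFD] | [#x10000-#x10FFFF]
    static boolean isValidXmlChar(int cp) {
        return cp == 0x9 || cp == 0xA || cp == 0xD
                || (cp >= 0x20 && cp <= 0xD7FF)
                || (cp >= 0xE000 && cp <= 0xFFFD)
                || (cp >= 0x10000 && cp <= 0x10FFFF);
    }

    // Returns a copy of the input with every disallowed code point removed.
    // Unpaired surrogates show up as code points in D800-DFFF and are dropped too.
    static String stripInvalidXmlChars(String input) {
        StringBuilder out = new StringBuilder(input.length());
        input.codePoints()
             .filter(XmlSanitizer::isValidXmlChar)
             .forEach(out::appendCodePoint);
        return out.toString();
    }

    // Cleans the raw text first, then parses it with the standard DOM parser.
    // A character stream is used, so the parser does not re-decode the bytes.
    static Document parseLenient(String rawXml) throws Exception {
        String cleaned = stripInvalidXmlChars(rawXml);
        DocumentBuilder builder =
                DocumentBuilderFactory.newInstance().newDocumentBuilder();
        return builder.parse(new InputSource(new StringReader(cleaned)));
    }
}
```

With something like this in place, XmlSanitizer.parseLenient(text) could be called on the contents of os__flag_8c.xml instead of passing the file directly to DocumentBuilder.parse. If losing characters is not acceptable, replacing them with a visible placeholder in the filter step is an alternative worth considering.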