Question
I have this XML file, called xmltest.xml:
<?xml version="1.0" encoding="GBK"?>
<productMeta>
<bands>1,2,3,4</bands>
<imageName>TestName.tif</imageName>
<browseName>TestName.jpg</browseName>
</productMeta>
And I have this Python dummy code:
import xml.etree.ElementTree as ET
xmldoc = ET.parse('xmltest.xml')
But it raises a ValueError.
I understand this error: it is raised because of the encoding declaration in the first line of the XML file. The XML file is UTF-8 encoded but always has that declaration (I'm not the creator of the XML files to be analyzed). How can I work around this encoding declaration when parsing an XML file like the one above?
Recommended answer
One thing I tried that worked for me is to open the XML file as a file object, then call ElementTree.fromstring(), passing in the complete contents of the file.
Example:
>>> import xml.etree.ElementTree as ET
>>> ef = ET.parse('a.xml')
Traceback (most recent call last):
File "<stdin>", line 1, in <module>
File "C:\Python34\lib\xml\etree\ElementTree.py", line 1187, in parse
tree.parse(source, parser)
File "C:\Python34\lib\xml\etree\ElementTree.py", line 598, in parse
self._root = parser._parse_whole(source)
ValueError: multi-byte encodings are not supported
>>> with open('a.xml', 'r', encoding='utf-8') as f:
... ef = ET.fromstring(f.read())
...
>>> ef
<Element 'productMeta' at 0x028DF180>
You can also create an XMLParser with the required encoding, which overrides the encoding named in the file's declaration and should let you parse the file directly. Example:
import xml.etree.ElementTree as ET
xmlp = ET.XMLParser(encoding="utf-8")
f = ET.parse('a.xml',parser=xmlp)
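As a rough, self-contained sketch of both approaches, the snippet below writes the question's sample XML to a temporary UTF-8 file and parses it twice. The helper names parse_with_encoding and parse_via_text are hypothetical, introduced only for illustration, and the default of "utf-8" assumes the files really are UTF-8 as the question states.

```python
import os
import tempfile
import xml.etree.ElementTree as ET


def parse_with_encoding(path, actual_encoding="utf-8"):
    """Parse the file with a parser whose encoding overrides the declaration.

    Hypothetical helper; actual_encoding is an assumption about the file.
    """
    parser = ET.XMLParser(encoding=actual_encoding)
    return ET.parse(path, parser=parser).getroot()


def parse_via_text(path, actual_encoding="utf-8"):
    """Decode the file ourselves, then hand the resulting str to fromstring().

    Hypothetical helper; the explicit encoding avoids depending on the
    platform's default text encoding.
    """
    with open(path, "r", encoding=actual_encoding) as f:
        return ET.fromstring(f.read())


# Sample document from the question: UTF-8 bytes, but declared as GBK.
sample = (
    '<?xml version="1.0" encoding="GBK"?>\n'
    "<productMeta>\n"
    "<bands>1,2,3,4</bands>\n"
    "<imageName>TestName.tif</imageName>\n"
    "<browseName>TestName.jpg</browseName>\n"
    "</productMeta>\n"
)

with tempfile.TemporaryDirectory() as tmp:
    path = os.path.join(tmp, "xmltest.xml")
    with open(path, "w", encoding="utf-8") as f:
        f.write(sample)
    root_a = parse_with_encoding(path)
    root_b = parse_via_text(path)

print(root_a.tag, root_a.findtext("bands"))
print(root_b.tag, root_b.findtext("imageName"))
```

Both helpers return the root element directly, whereas ET.parse() on its own returns an ElementTree. If the files might not all be UTF-8, the encoding argument would need to be chosen per file.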