lxml removes the <?xml ...> tag when parsing?

Question

I'm currently working with XML documents (adding elements, adding attributes, etc.), so I first need to parse the XML before working on it. However, lxml seems to be removing the <?xml ...> element. For example:

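from lxml import etree

# Pass bytes: under Python 3, lxml rejects str input that carries an encoding declaration.
xml = b'<?xml version="1.0" encoding="utf-8"?><dmodule>test</dmodule>'
tree = etree.fromstring(xml, etree.XMLParser())
print(etree.tostring(tree).decode())  # tostring() returns bytes by default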

This prints:

<dmodule>test</dmodule>

Does anyone know why the <?xml ...> element is being removed? I thought encoding tags were valid XML. Thanks for your time.

Recommended answer

The <?xml ...> line is the XML declaration, so strictly speaking it is not an element. It only gives information (such as the XML version and encoding) about the XML tree below it, which is why it does not appear as a node when the tree is serialised.
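One way to see that the declaration is document-level information rather than a node is to look at the document's docinfo. The following is only a minimal sketch, assuming Python 3 and a reasonably recent lxml; the variable names (source, root, docinfo) are illustrative and not from the original post.

```python
from lxml import etree

# Bytes input, because lxml rejects str input with an encoding declaration.
source = b'<?xml version="1.0" encoding="utf-8"?><dmodule>test</dmodule>'
root = etree.fromstring(source)

# The root element is <dmodule>; the declaration is not a node in the tree.
print(root.tag)

# The values from the declaration are kept on the document, not on an element.
docinfo = root.getroottree().docinfo
print(docinfo.xml_version)
print(docinfo.encoding)
```

So the declaration is parsed and remembered; it is just not written back unless the serialiser is asked to do so.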

If you need lxml to print it out, pass the xml_declaration=True flag to etree.tostring(). The serialisation section of the lxml documentation has more information about it:

http://lxml.de/api.html#serialisation

etree.tostring(tree, xml_declaration=True)
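As a fuller illustration, the sketch below compares the default output with the output when the declaration is requested. It assumes Python 3 and that UTF-8 is the desired output encoding; the variable names are illustrative. Note that tostring() returns bytes unless told otherwise, and that lxml may write the declaration with different quoting than the input used.

```python
from lxml import etree

# Bytes input, because lxml rejects str input with an encoding declaration.
source = b'<?xml version="1.0" encoding="utf-8"?><dmodule>test</dmodule>'
root = etree.fromstring(source)

# Default serialisation: no XML declaration in the output.
without_decl = etree.tostring(root)

# Request the declaration explicitly and choose the output encoding.
with_decl = etree.tostring(root, xml_declaration=True, encoding="utf-8")

print(without_decl.decode("utf-8"))
print(with_decl.decode("utf-8"))
```

Passing encoding alongside xml_declaration=True keeps the encoding named in the declaration in step with the bytes that are actually produced.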

