Problem description
I'm currently working with XML documents (adding elements, adding attributes, etc.), so I first need to parse the XML before working on it. However, lxml seems to be removing the <?xml ...> element. For example:
from lxml import etree
# Parse from bytes: lxml rejects str input that carries an encoding declaration.
tree = etree.fromstring(b'<?xml version="1.0" encoding="utf-8"?><dmodule>test</dmodule>', etree.XMLParser())
print(etree.tostring(tree).decode())
will result in
<dmodule>test</dmodule>
Does anyone know why the <?xml ...> element is being removed? I thought encoding tags were valid XML. Thanks for your time.
Recommended answer
The <?xml ...?> line is the XML declaration, so strictly speaking it is not an element. It only gives information about the XML tree below it (the XML version and the encoding), which is why it does not appear when lxml serialises the parsed element by default.
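To illustrate that the declaration is document-level information and not a node in the tree, here is a minimal sketch. It assumes the document is parsed with etree.parse (instead of etree.fromstring as in the question) so that an ElementTree with a docinfo attribute is available; the variable names are illustrative only.

```python
from io import BytesIO
from lxml import etree

# Same document as in the question, given as bytes.
source = b'<?xml version="1.0" encoding="utf-8"?><dmodule>test</dmodule>'

# etree.parse returns an ElementTree, which keeps document-level information.
doc = etree.parse(BytesIO(source))

# The root element is <dmodule>; the declaration is not part of the element tree.
print(doc.getroot().tag)

# The declared version and encoding are exposed through docinfo.
print(doc.docinfo.xml_version)
print(doc.docinfo.encoding)
```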
If you need to print it out with lxml, the serialisation section of the lxml documentation describes the xml_declaration=True flag you can use:
http://lxml.de/api.html#serialisation
etree.tostring(tree, xml_declaration=True)
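As a rough end-to-end sketch of the call above, the following compares the default output with the output when the declaration is requested. The explicit encoding="utf-8" argument is an addition here, chosen so the emitted declaration names the same encoding as the input; the variable names are illustrative.

```python
from lxml import etree

# Parse from bytes, since lxml rejects str input that carries an encoding declaration.
source = b'<?xml version="1.0" encoding="utf-8"?><dmodule>test</dmodule>'
root = etree.fromstring(source)

# Default serialisation: only the element, no declaration.
without_decl = etree.tostring(root)

# Request the declaration explicitly and choose the output encoding.
with_decl = etree.tostring(root, xml_declaration=True, encoding="utf-8")

print(without_decl.decode())
print(with_decl.decode("utf-8"))
```

The declaration that lxml writes is generated at serialisation time, so its quoting and layout may differ from the one in the original input.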