使用lxml从python中的xml中删除命名空间和前缀

使用lxml从python中的xml中删除命名空间和前缀

本文介绍了使用lxml从python中的xml中删除命名空间和前缀的处理方法,对大家解决问题具有一定的参考价值,需要的朋友们下面随着小编来一起学习吧!

问题描述

我有一个 xml 文件,我需要打开并对其进行一些更改,其中一项更改是删除命名空间和前缀,然后保存到另一个文件.这是xml:

I have an xml file I need to open and make some changes to, one of those changes is to remove the namespace and prefix and then save to another file.Here is the xml:

<?xml version='1.0' encoding='UTF-8'?>
<package xmlns="http://apple.com/itunes/importer">
  <provider>some data</provider>
  <language>en-GB</language>
</package>

我可以进行我需要的其他更改,但不知道如何删除命名空间和前缀.这是我需要的 reusklt xml:

I can make the other changes I need, but can't find out how to remove the namespace and prefix. This is the reusklt xml I need:

<?xml version='1.0' encoding='UTF-8'?>
<package>
  <provider>some data</provider>
  <language>en-GB</language>
</package>

这是我的脚本,它将打开并解析 xml 并保存它:

And here is my script which will open and parse the xml and save it:

metadata = '/Users/user1/Desktop/Python/metadata.xml'
from lxml import etree
parser = etree.XMLParser(remove_blank_text=True)
open(metadata)
tree = etree.parse(metadata, parser)
root = tree.getroot()
tree.write('/Users/user1/Desktop/Python/done.xml', pretty_print = True, xml_declaration = True, encoding = 'UTF-8')

那么如何在我的脚本中添加代码以删除命名空间和前缀?

So how would I add code in my script which will remove the namespace and prefix?

推荐答案

按照 Uku Loskit 的建议替换标签.除此之外,使用 lxml.objectify.deannotate.

Replace tag as Uku Loskit suggests. In addition to that, use lxml.objectify.deannotate.

from lxml import etree, objectify

metadata = '/Users/user1/Desktop/Python/metadata.xml'
parser = etree.XMLParser(remove_blank_text=True)
tree = etree.parse(metadata, parser)
root = tree.getroot()

####
for elem in root.getiterator():
    if not hasattr(elem.tag, 'find'): continue  # (1)
    i = elem.tag.find('}')
    if i >= 0:
        elem.tag = elem.tag[i+1:]
objectify.deannotate(root, cleanup_namespaces=True)
####

tree.write('/Users/user1/Desktop/Python/done.xml',
           pretty_print=True, xml_declaration=True, encoding='UTF-8')

更新

一些标签如Comment 在访问tag 属性时返回一个函数.为此添加了一个警卫.(1)

Some tags like Comment return a function when accessing tag attribute. added a guard for that. (1)

这篇关于使用lxml从python中的xml中删除命名空间和前缀的文章就介绍到这了,希望我们推荐的答案对大家有所帮助,也希望大家多多支持!

08-06 07:23