I want to load a legacy XML file, manipulate it, and save it.
Here is my code:
from xml.etree import cElementTree as ET

NS = "{http://www.somedomain.com/XI/Traffic/10}"

def fix_xml(filename):
    f = ET.parse(filename)
    root = f.getroot()
    eventlist = root.findall("%(ns)sEvent" % {'ns': NS})
    xpath = "%(ns)sEventDetail/%(ns)sEventDescription" % {'ns': NS}
    for event in eventlist:
        desc = event.find(xpath)
        desc.text = desc.text.upper()  # do some editing to the text
    ET.ElementTree(root, nsmap=NS).write("out.xml", encoding="utf-8")

fix_xml("test.xml")
The root tag of the file I load carries these attributes:
xmlns="http://www.somedomain.com/XI/Traffic/10"
xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance"
xsi:schemaLocation="http://www.somedomain.com/XI/Traffic/10 10.xds"
I have the following namespace-related problems:
- As you can see, for every tag lookup I have to put the namespace at the beginning to retrieve a child.
- The generated XML file doesn't have <?xml version="1.0" encoding="utf-8"?> at the beginning.
- The tags in the output look like <ns0:eventDescription>, while I need the output to match the original <eventDescription>, without a namespace prefix at the beginning.
How can these be solved?
Have a look at the lxml tutorial section on namespaces, and at the effbot article about namespaces in ElementTree.
Problem 1: Put up with it, like everybody else does. Instead of "%(ns)sEvent" % {'ns': NS}, try NS + "Event".
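As a rough sketch of what that looks like in practice, the snippet below builds the qualified tag by concatenation and, as an alternative, passes a prefix map to findall(). It uses the standard library's xml.etree.ElementTree; the root element name Traffic, the sample document, and the prefix "t" are assumptions made up for illustration, not taken from the question's file.

```python
import xml.etree.ElementTree as ET

TRAFFIC_URI = "http://www.somedomain.com/XI/Traffic/10"
NS = "{%s}" % TRAFFIC_URI  # ElementTree's "{uri}tag" notation

# Hypothetical sample document; the real file's layout may differ.
SAMPLE = (
    '<Traffic xmlns="http://www.somedomain.com/XI/Traffic/10">'
    "<Event><EventDetail>"
    "<EventDescription>road closed</EventDescription>"
    "</EventDetail></Event>"
    "</Traffic>"
)

root = ET.fromstring(SAMPLE)

# Option A: plain string concatenation, as suggested above.
events = root.findall(NS + "Event")

# Option B: map a prefix of your choosing to the URI and use it in the path.
prefixes = {"t": TRAFFIC_URI}  # "t" is an arbitrary, hypothetical prefix
descriptions = root.findall("t:Event/t:EventDetail/t:EventDescription", prefixes)
```

Either way the namespace still has to be spelled out somewhere; the prefix map only makes longer paths easier to read.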
Problem 2: By default, the XML declaration is written only if it is required. You can force it by passing xml_declaration=True in your write() call; lxml supports this, and so does the standard library's ElementTree from Python 2.7 onward.
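A minimal sketch of forcing the declaration, assuming the standard library's xml.etree.ElementTree rather than lxml. The element names are hypothetical, and the output goes to an in-memory buffer instead of out.xml so the result can be inspected directly.

```python
import io
import xml.etree.ElementTree as ET

# Hypothetical tiny tree standing in for the parsed document.
root = ET.Element("Traffic")
ET.SubElement(root, "Event").text = "road closed"

buffer = io.BytesIO()
# xml_declaration=True forces the <?xml ...?> line even for utf-8 output.
ET.ElementTree(root).write(buffer, encoding="utf-8", xml_declaration=True)
output = buffer.getvalue()
```

Passing a filename such as "out.xml" in place of the buffer should behave the same way.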
Problem 3: The nsmap arg appears to be lxml-only. AFAICT it needs a mapping, not a string. Try nsmap={None: NS} (with the bare namespace URI as the value, without the braces that NS carries). The effbot article has a section describing a workaround for this.
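One option in the standard library, which is not necessarily the workaround the effbot article describes, is to register the URI under an empty prefix with ET.register_namespace() before serializing. The sketch below assumes xml.etree.ElementTree on Python 3, and a made-up sample document with a hypothetical Traffic root element.

```python
import xml.etree.ElementTree as ET

TRAFFIC_URI = "http://www.somedomain.com/XI/Traffic/10"

# Hypothetical sample document; the real file's layout may differ.
SAMPLE = (
    '<Traffic xmlns="http://www.somedomain.com/XI/Traffic/10">'
    "<Event><EventDetail>"
    "<EventDescription>road closed</EventDescription>"
    "</EventDetail></Event>"
    "</Traffic>"
)

# Register the URI under the empty prefix so it is written as the default
# namespace. Note that this registry is global to the process.
ET.register_namespace("", TRAFFIC_URI)

root = ET.fromstring(SAMPLE)
for desc in root.iter("{%s}EventDescription" % TRAFFIC_URI):
    desc.text = desc.text.upper()  # same edit as in the question

result = ET.tostring(root, encoding="unicode")
```

With the registration in place, the serialized tags should come out unprefixed, with a single xmlns attribute on the root element.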