Problem description
I am using lxml to read my XML file, with code similar to what is shown below. It works fine with lxml 2.3 beta1, but with lxml 2.3 it gives me an XML syntax error, shown below. I went through the release notes for both versions but could not figure out what caused this error or how to fix it. Please help if you have come across something like this or have any clues about it.
Thanks!
Code:
from lxml import etree

def parseXml(context, attribList, elemList):
    for event, element in context:
        if element.tag in elemList:
            # read element attributes
            element.clear()

def main(fullFilePath):
    # fullFilePath: path to the XML file to parse
    ns = '{NS}'
    attribList = ['name', 'age', 'id']
    elemList = [ns + 'Employee', ns + 'Experience', ns + 'Employment', ns + 'Project', ns + 'Award']
    context = etree.iterparse(fullFilePath, events=("start", "end"))
    parseXml(context, attribList, elemList)
Error:
Sample XML:
<root xmlns='NS'>
  <Employee Name="Mr.ZZ" Age="30">
    <Experience TotalYears="10" StartDate="2000-01-01" EndDate="2010-12-12">
      <Employment id = "1" EndTime="ABC" StartDate="2000-01-01" EndDate="2002-12-12">
        <Project Name="ABC_1" Team="4">
        </Project>
      </Employment>
      <Employment id = "2" EndTime="XYZ" StartDate="2003-01-01" EndDate="2010-12-12">
        <PromotionStatus>Manager</PromotionStatus>
        <Project Name="XYZ_1" Team="7">
          <Award>Star Team Member</Award>
        </Project>
      </Employment>
    </Experience>
  </Employee>
</root>
The 'Employee' element is repeated many times within the root, and the error happens only after the parser has already gone through many of the employees correctly.
Edit 1: On capturing the exception, I get the following:
WARNING:NAMESPACE:NS_ERR_UNDEFINED_NAMESPACE: Namespace default prefix was not found
Recommended answer
OK, so I finally figured out what was going on. Following the good advice to clean up elements once they have been used, I was clearing all the elements, including the root node. The root node is the one that declares the default namespace, which applies to every node inside it. Since I had cleared the root node, the default namespace was no longer part of the nsmap of its subelements. The previous versions seem to have been forgiving about this, but the latest version is stricter.
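To make the inheritance point concrete, here is a small sketch showing that a child element's nsmap includes the default namespace declared on the root. The document string is a cut-down, illustrative version of the sample XML from the question, not the real file.

```python
from lxml import etree

# Cut-down version of the sample document (illustrative only).
root = etree.fromstring("<root xmlns='NS'><Employee Name='Mr.ZZ'/></root>")
child = root[0]

# The child declares no namespace of its own; it inherits the default
# namespace (stored under the key None) from the root element.
print(child.nsmap)
print(child.tag)
```

The child's tag is namespace-qualified only because of the declaration on the root, which is why that declaration matters while the children are still being parsed.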
Not clearing the root element until I was done reading the XML did the trick for me.
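A minimal sketch of that approach, assuming the element names from the question. The helper name read_employee_names and the in-memory sample are hypothetical and only for illustration. It remembers the root from the first "start" event, clears each Employee only on its "end" event, and clears the root after the loop has finished.

```python
import io
from lxml import etree

NS = "{NS}"  # namespace placeholder taken from the question


def read_employee_names(source):
    """Collect Employee names while streaming; the root is cleared only at the end."""
    names = []
    root = None
    for event, element in etree.iterparse(source, events=("start", "end")):
        if root is None:
            # The first "start" event belongs to the root element, which
            # carries the default namespace declaration.
            root = element
            continue
        if event == "end" and element.tag == NS + "Employee":
            names.append(element.get("Name"))
            # Free the finished subtree; the root itself is left alone here.
            element.clear()
    if root is not None:
        root.clear()  # parsing has finished, so clearing the root is safe now
    return names


# Hypothetical in-memory document standing in for the real file.
sample = (
    b"<root xmlns='NS'>"
    b"<Employee Name='A' Age='30'><Experience TotalYears='10'/></Employee>"
    b"<Employee Name='B' Age='41'><Experience TotalYears='20'/></Employee>"
    b"</root>"
)
print(read_employee_names(io.BytesIO(sample)))
```

Clearing only on "end" events is a deliberate choice in this sketch: on a "start" event the element's children and text have not been parsed yet, so there is nothing useful to free at that point.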