A good Python XML parser for namespace-heavy documents

lxml is namespace-aware.

    >>> from lxml import etree
    >>> et = etree.XML("""<root xmlns="foo" xmlns:stuff="bar"><bar><stuff:baz /></bar></root>""")
    >>> etree.tostring(et, encoding=str)  # encoding=str only needed in Python 3, to avoid getting bytes
    '<root xmlns="foo" xmlns:stuff="bar"><bar><stuff:baz/></bar></root>'
    >>> et.xpath("f:bar", namespaces={"b": "bar", "f": "foo"})
    [<Element {foo}bar at ...>]

In your example:

    from lxml import etree

    # Remove the b prefix in Python 2.
    # It is needed in Python 3 because
    # "Unicode strings with encoding declaration are not supported."
    et = etree.XML(b"""...""")

    ns = {
        'lom': 'http://ltsc.ieee.org/xsd/LOM',
        'zs': 'http://www.loc.gov/zing/srw/',
        'dc': 'http://purl.org/dc/elements/1.1/',
        'voc': 'http://www.schooletc.co.uk/vocabularies/',
        'srw_dc': 'info:srw/schema/1/dc-schema'
    }

    # According to the docs, .xpath always returns a list when querying for elements.
    # .find returns one element, but only supports a subset of XPath.
    record = et.xpath("zs:records/zs:record", namespaces=ns)[0]

    # In this example we know there is only one record;
    # otherwise, apply the following to every element the query above returns.

    # The leading "." keeps the search inside this record;
    # a bare "//" would search from the document root.
    name = record.xpath(".//voc:name", namespaces=ns)[0].text
    print("name:", name)

    lom_entry = record.xpath("zs:recordData/srw_dc:dc/"
                             "lom:metaMetadata/lom:identifier/"
                             "lom:entry",
                             namespaces=ns)[0].text
    print('lom_entry:', lom_entry)

    # taxon_id avoids shadowing the built-in id()
    lom_ids = [taxon_id.text for taxon_id in
               record.xpath("zs:recordData/srw_dc:dc/"
                            "lom:classification/lom:taxonPath/"
                            "lom:taxon/lom:id",
                            namespaces=ns)]
    print("lom_ids:", lom_ids)

Output:

    name: Frank Malina
    lom_entry: 2.6
    lom_ids: ['PYTHON', 'XML', 'XML-NAMESPACES']
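Since the answer mentions that .find only supports a subset of XPath, here is a minimal sketch of the .find / .findall / .findtext route with a prefix map. The document, both namespace URIs, and the prefixes "c" and "m" are made up for illustration and are not from the question. The fallback import is only there so the sketch also runs without lxml installed; the standard library's ElementTree shares this part of the API.

```python
try:
    from lxml import etree
except ImportError:  # fallback: the stdlib module shares the find/findall API
    import xml.etree.ElementTree as etree

# Hypothetical document: both namespace URIs are invented for this sketch.
DOC = b"""<catalog xmlns="urn:example:catalog" xmlns:m="urn:example:meta">
  <item><m:title>First</m:title><m:tag>a</m:tag><m:tag>b</m:tag></item>
  <item><m:title>Second</m:title></item>
</catalog>"""

# The default namespace has no prefix in the XML, so we choose one ("c") here.
# Prefixes in this map are local to the queries; only the URIs must match.
NS = {"c": "urn:example:catalog", "m": "urn:example:meta"}

root = etree.fromstring(DOC)

# .find returns the first match, or None when nothing matches.
first_item = root.find("c:item", namespaces=NS)
first_title = first_item.findtext("m:title", namespaces=NS)

# .findall returns a list of all matches, relative to the element it is called on.
titles = [t.text for t in root.findall("c:item/m:title", namespaces=NS)]
tags = [t.text for t in first_item.findall("m:tag", namespaces=NS)]

# A miss gives None rather than an IndexError, unlike indexing an .xpath result.
missing = root.find("c:nonexistent", namespaces=NS)

# Element tags are stored in Clark notation: {namespace-uri}localname.
print(first_item.tag)
print(first_title, titles, tags, missing)
```

Compared with .xpath(...)[0], .find is convenient when an element may be absent, because the None result can be checked directly. For predicates or functions beyond the supported subset, .xpath remains the tool to reach for.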