Escaping characters in an XML file with Python

Problem description

I need to escape special characters in an ugly XML file (5000 lines or so long). Here's an example of XML I have to deal with:

<root>
 <element>
  <name>name & surname</name>
  <mail>[email protected]</mail>
 </element>
</root>

The problem here is the bare "&" character in the name, which makes the file invalid XML. How would you escape special characters like this with a Python library? I couldn't find a way to do it with BeautifulSoup.

Recommended answer

If you don't need to keep the invalid characters in the XML, you can use the lxml XML parser's recover option (see Parsing broken XML with lxml.etree.iterparse). Note that the parser drops the offending character rather than escaping it:

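from lxml import etree

# broken_xml holds the XML shown above as a string.
parser = etree.XMLParser(recover=True)  # recover from bad characters
root = etree.fromstring(broken_xml, parser=parser)
print(etree.tostring(root).decode())  # tostring() returns bytes on Python 3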

Output

<root>
<element>
<name>name  surname</name>
<mail>[email protected]</mail>
</element>
</root>
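
As the output shows, recover mode discards the "&" instead of escaping it. If the character has to survive, one option is to escape bare ampersands in the raw text before parsing. The following is only a minimal sketch, not part of the original answer: the helper name `escape_bare_ampersands`, the regular expression, and the placeholder address `user@example.com` are assumptions made for illustration. It uses the standard library parser and assumes the file contains no CDATA sections or comments with a literal "&", since the regex would rewrite those too. Named entities other than the five predefined by XML (such as `&nbsp;`) are left alone and would still be rejected by the parser.

```python
import re
import xml.etree.ElementTree as ET

# Match "&" that does not already start an entity reference
# such as &amp; or &#38; or &#x26; (hypothetical pattern for this sketch).
BARE_AMP = re.compile(r"&(?!(?:[A-Za-z][A-Za-z0-9]*|#[0-9]+|#x[0-9A-Fa-f]+);)")


def escape_bare_ampersands(text):
    """Replace stray '&' with '&amp;', leaving existing entities untouched."""
    return BARE_AMP.sub("&amp;", text)


# Same shape as the example above; the mail address is a placeholder.
broken_xml = """<root>
 <element>
  <name>name & surname</name>
  <mail>user@example.com</mail>
 </element>
</root>"""

fixed_xml = escape_bare_ampersands(broken_xml)

# The repaired text is now well-formed, so a strict parser accepts it.
root = ET.fromstring(fixed_xml)
name = root.find("element/name").text
print(name)  # name & surname
```

For a file of around 5000 lines, reading the whole text into memory, applying the substitution once, and then parsing is usually sufficient. A stray "<" in text content cannot be repaired this way, because it is ambiguous with the start of a tag.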

