本文介绍了在Python中的XML文件的字符转义的处理方法,对大家解决问题具有一定的参考价值,需要的朋友们下面随着小编来一起学习吧!
问题描述
我要逃避一个丑陋的XML文件中的特殊字符(5000行左右长)。这是我要处理XML的例子:
I need to escape special characters in an ugly XML file (5000 lines or so long). Here's an example of XML I have to deal with:
<root>
<element>
<name>name & surname</name>
<mail>[email protected]</mail>
</element>
</root>
下面的问题是字符与&amp;在名字里。你会如何特殊字符转义像这样用Python库?我没有找到。
Here the problem is the character "&" in the name. How would you escape special characters like this with a Python library? I didn't find the way to do it with BeautifulSoup.
推荐答案
如果你不关心在XML你可以使用XML解析器的恢复
选项无效字符(看到):
If you don't care about invalid characters in the xml you could use XML parser's recover
option (see Parsing broken XML with lxml.etree.iterparse):
from lxml import etree
parser = etree.XMLParser(recover=True) # recover from bad characters.
root = etree.fromstring(broken_xml, parser=parser)
print etree.tostring(root)
输出
<root>
<element>
<name>name surname</name>
<mail>[email protected]</mail>
</element>
</root>
这篇关于在Python中的XML文件的字符转义的文章就介绍到这了,希望我们推荐的答案对大家有所帮助,也希望大家多多支持!