Question
I am using lxml to read an XML file and change a few details. However, when I run it, I find that even if I only use lxml to read the file and then write it straight back out, as below:
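from lxml import etree

fil = 'iTunes Music Library.XML'
tre = etree.parse(fil)  # parse the library file into an element tree
tre.write('temp.xml')   # write it back out with lxml's default settings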
I find that Queensrÿche is converted to Queensr&#255;che. Does anyone know how to fix this?
Recommended answer
Change the last line to:
tre.write('temp.xml', encoding='utf-8')
Otherwise lxml writes the XML in ASCII encoding, so it has to escape every non-ASCII character as a numeric character reference.
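The difference between the two calls can be seen without touching the real library file. The following is a minimal sketch, not the asker's actual script: the one-element document and the serialize helper are made up for illustration, and the output goes to an in-memory buffer instead of temp.xml. The import fallback to the standard library's xml.etree.ElementTree is only there so the sketch still runs where lxml is not installed; its write() also defaults to ASCII output.

```python
import io

try:
    from lxml import etree
except ImportError:  # fallback for this sketch only, not part of the answer
    import xml.etree.ElementTree as etree


def serialize(tree, **write_kwargs):
    """Hypothetical helper: write the tree to memory and return the raw bytes."""
    buf = io.BytesIO()
    tree.write(buf, **write_kwargs)
    return buf.getvalue()


# Made-up stand-in for the iTunes library: one element with a non-ASCII name.
source = '<artist>Queensrÿche</artist>'.encode('utf-8')
tree = etree.parse(io.BytesIO(source))

ascii_bytes = serialize(tree)                   # default: ASCII, ÿ gets escaped
utf8_bytes = serialize(tree, encoding='utf-8')  # ÿ is written as UTF-8 bytes

print(ascii_bytes)
print(utf8_bytes)

# Both outputs describe the same document once they are parsed again.
print(etree.fromstring(ascii_bytes).text == etree.fromstring(utf8_bytes).text)
```

The escaped form is still valid XML and parses back to the same text, so nothing is lost either way. Passing encoding='utf-8' only changes how the character is stored in the file, which matters when the file is read by people or by tools that compare the raw bytes.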