This post covers how to stop lxml from altering Unicode characters when it writes an XML file.

Problem description

I am using lxml to read an XML file and change a few details. However, even if I only read the file with lxml and write it straight back out, as below, the output differs from the input:

from lxml import etree

fil = 'iTunes Music Library.XML'
tre = etree.parse(fil)
tre.write('temp.xml')  # no encoding argument given

I find that Queensrÿche is converted to Queensr&#255;che. Does anyone know how to fix this?

Recommended answer

Change the last line to:

tre.write('temp.xml', encoding='utf-8')

Otherwise lxml writes the XML in ASCII encoding, so it has to escape every non-ASCII character as a numeric character reference.
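To see the difference side by side, here is a minimal sketch that writes the same tree twice, once with the default encoding and once with encoding='utf-8', then compares the raw bytes. The one-element document and the write_and_read helper are made up for illustration and stand in for the real iTunes library file. The fallback import is only there so the sketch runs without lxml installed; it assumes the standard library's ElementTree escapes non-ASCII text the same way by default.

```python
import os
import tempfile

try:
    from lxml import etree
except ImportError:
    # Fallback so the sketch still runs without lxml; assumption: the
    # standard library's ElementTree behaves the same for this case.
    import xml.etree.ElementTree as etree

# Hypothetical stand-in for one entry of the iTunes library file.
root = etree.fromstring("<dict><string>Queensr\u00ffche</string></dict>")
tree = etree.ElementTree(root)


def write_and_read(encoding=None):
    """Write the tree to a temporary file and return the raw bytes."""
    fd, path = tempfile.mkstemp(suffix=".xml")
    os.close(fd)
    try:
        if encoding is None:
            tree.write(path)  # default: ASCII output
        else:
            tree.write(path, encoding=encoding)
        with open(path, "rb") as handle:
            return handle.read()
    finally:
        os.remove(path)


default_bytes = write_and_read()
utf8_bytes = write_and_read("utf-8")

print(default_bytes)               # the y-diaeresis appears as a character reference
print(utf8_bytes.decode("utf-8"))  # the y-diaeresis appears as a literal character

# Both files describe the same document: parsing either one gives back
# the original text.
text_default = etree.fromstring(default_bytes).find("string").text
text_utf8 = etree.fromstring(utf8_bytes).find("string").text
```

Note that the escaped output is still valid XML and parses back to the same text, so the default behavior loses no data. Passing encoding='utf-8' matters mainly when the file should stay readable or match the original byte for byte.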

