ElementTree cannot parse UTF-8 Unicode?

Problem description

Hello All,

I am getting a "not well-formed" error at the beginning of the Korean
text in the second example. Am I doing something wrong with how I am
encoding my Korean? Do I need more of a wrapper around it than simple
quotes? Is there some sort of XML syntax for indicating a Unicode
string, or does the ElementTree library just not support reading
Unicode?

here is my test snippet:

from elementtree import ElementTree
vocabXML = ElementTree.parse('test2.xml').getroot()

where I have two data files:

this one works:
<?xml version="1.0" encoding="UTF-8"?>
<Vocab>
<Word L1='Hahha'></Word>
</Vocab>

this one fails:
<?xml version="1.0" encoding="UTF-8"?>
<Vocab>
<Word L1="ì?′???í??ì??ì??!"></Word>
</Vocab>

Recommended answer




this works just fine on my machine.

what's the exact error message?

what does

print repr(open("test2.xml").read())

print on your machine?
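
On Python 3 the same check needs binary mode, because text mode would decode the bytes before you get to see them. The following is only a sketch: the helper name `inspect_xml_bytes` and the sample file names are made up for illustration, and the CP949 sample is an assumed example of a file saved in a different encoding than its declaration claims, not something the original post confirms.

```python
import os
import tempfile


def inspect_xml_bytes(path):
    """Return (raw_bytes, decodes_as_utf8) for the file at `path`."""
    with open(path, "rb") as f:  # binary mode: no implicit decoding
        raw = f.read()
    try:
        raw.decode("utf-8")
        return raw, True
    except UnicodeDecodeError:
        return raw, False


declaration = b'<?xml version="1.0" encoding="UTF-8"?>\n'
text = '<Vocab>\n<Word L1="어녕하세요!"></Word>\n</Vocab>\n'

results = {}
with tempfile.TemporaryDirectory() as tmp:
    # Hypothetical samples: one really saved as UTF-8, one saved as CP949.
    samples = {"utf8.xml": "utf-8", "cp949.xml": "cp949"}
    for name, codec in samples.items():
        path = os.path.join(tmp, name)
        with open(path, "wb") as f:
            f.write(declaration + text.encode(codec))
        raw, ok = inspect_xml_bytes(path)
        results[name] = ok
        print(name, "valid UTF-8:", ok)
        print(repr(raw))
```

If the bytes of the real test2.xml do not decode as UTF-8, the declaration and the editor's save encoding disagree, which is one plausible source of a "not well-formed" error.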

what happens if you attempt to parse

<Vocab>
<Word L1="어녕하세요!" />
</Vocab>

?

</F>
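
For readers on a current Python, the parsing test suggested above can be tried with `xml.etree.ElementTree`, the standard-library successor of the `elementtree` package used in the question. This is a sketch under assumptions: the CP949 failure case is a hypothetical stand-in for "bytes that do not match the declared encoding", and the variable names are illustrative.

```python
import xml.etree.ElementTree as ET

snippet = '<Vocab>\n<Word L1="어녕하세요!" />\n</Vocab>'

# Parsing from a str: the text is already decoded, so no byte encoding is involved.
root = ET.fromstring(snippet)
greeting = root.find("Word").get("L1")

# Parsing from bytes that match the declared encoding also works.
declaration = b'<?xml version="1.0" encoding="UTF-8"?>\n'
root_from_bytes = ET.fromstring(declaration + snippet.encode("utf-8"))

# Assumed failure case: bytes saved as CP949 but declared as UTF-8.
try:
    ET.fromstring(declaration + snippet.encode("cp949"))
    mismatch_error = None
except ET.ParseError as exc:
    mismatch_error = exc

print(greeting)
print(mismatch_error)
```

The point of the comparison is that the parser handles Korean text without any special wrapper or syntax, as long as the bytes on disk really are in the encoding the XML declaration names.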






You should post the file somewhere on the web. (I wouldn't expect Usenet
to transmit it properly.)

(Just jumping in to possibly save you a reply cycle.)
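
To illustrate why pasting the text into a post is unreliable, here is a small sketch of how UTF-8 bytes turn into unreadable characters when some step along the way treats them as a single-byte encoding. Latin-1 and the ASCII replacement step are assumptions chosen for the demonstration; the thread does not say which conversion actually happened.

```python
original = "어녕하세요!"

utf8_bytes = original.encode("utf-8")

# Simulate a transport that assumes one byte per character (Latin-1 here).
garbled = utf8_bytes.decode("latin-1")

# The damage is reversible only while every byte survives untouched.
restored = garbled.encode("latin-1").decode("utf-8")

# Once bytes are replaced (for example by '?'), the text cannot be recovered.
lossy = garbled.encode("ascii", errors="replace")

print(repr(garbled))
print(restored == original)
print(lossy)
```

Sharing the file itself, rather than its text, avoids this kind of loss because the bytes arrive unchanged.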


