Problem description
Hello All,
I am getting a "not well-formed" error at the beginning of the Korean
text in the second example. Am I doing something wrong with how I am
encoding my Korean? Do I need more of a wrapper around it than simple
quotes? Is there some sort of XML syntax for indicating a Unicode
string, or does the ElementTree library just not support reading
Unicode?
here is my test snippet:
from elementtree import ElementTree
vocabXML = ElementTree.parse('test2.xml').getroot()
where I have two data files:
this one works:
<?xml version="1.0" encoding="UTF-8"?>
<Vocab>
<Word L1='Hahha'></Word>
</Vocab>
this one fails:
<?xml version="1.0" encoding="UTF-8"?>
<Vocab>
<Word L1="ì?′???í??ì??ì??!"></Word>
</Vocab>
Recommended answer
this works just fine on my machine.
what's the exact error message?
what does
print repr(open("test2.xml").read())
print on your machine?
what happens if you attempt to parse
<Vocab>
<Word L1="어녕하세요!" />
</Vocab>
?
</F>
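For anyone following along on a current Python, here is a minimal sketch of the checks asked for above. It assumes Python 3 and the standard-library xml.etree.ElementTree rather than the standalone elementtree package used in the question. The helper name diagnose_xml_bytes is hypothetical, and the EUC-KR case is only one guess at how a file can end up disagreeing with its UTF-8 declaration; the actual cause in the question is not known.

```python
import xml.etree.ElementTree as ET


def diagnose_xml_bytes(data: bytes) -> str:
    """Say whether data is valid UTF-8 and, if so, whether it parses as XML."""
    try:
        data.decode("utf-8")
    except UnicodeDecodeError as exc:
        return f"not valid UTF-8: {exc.reason} at byte {exc.start}"
    try:
        ET.fromstring(data)
    except ET.ParseError as exc:
        return f"valid UTF-8 but not well-formed XML: {exc}"
    return "ok"


doc = (
    '<?xml version="1.0" encoding="UTF-8"?>\n'
    '<Vocab><Word L1="어녕하세요!"></Word></Vocab>'
)

# Same text, saved two ways. Only the first matches the declared encoding.
utf8_bytes = doc.encode("utf-8")
euckr_bytes = doc.encode("euc-kr")  # assumption: editor saved in a legacy Korean codec

print(diagnose_xml_bytes(utf8_bytes))
print(diagnose_xml_bytes(euckr_bytes))

# Python 3 counterpart of the repr() line above, for a file on disk:
# print(repr(open("test2.xml", "rb").read()))
```

Reading the file in binary mode matters here: the point is to see the bytes actually on disk, before any decoding has a chance to hide or change them.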
You should post the file somewhere on the web. (I wouldn't expect Usenet
to transmit it properly.)
(Just jumping in to possibly save you a reply cycle.)
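If posting the file elsewhere is not an option, one possible workaround is to send an ASCII-only form of the document, which a 7-bit transport should not be able to mangle. The sketch below assumes Python 3 and the standard-library xml.etree.ElementTree; it relies on the serializer writing non-ASCII characters as numeric character references when asked for us-ascii output. The variable names are illustrative only.

```python
import xml.etree.ElementTree as ET

text = "어녕하세요!"

# Build the same structure as the failing example.
root = ET.Element("Vocab")
ET.SubElement(root, "Word", L1=text)

# Serializing as us-ascii replaces each non-ASCII character with &#NNNNN;
ascii_bytes = ET.tostring(root, encoding="us-ascii")
print(ascii_bytes.decode("ascii"))

# Parsing the ASCII form gives back the original attribute value.
roundtrip = ET.fromstring(ascii_bytes).find("Word").get("L1")
print(roundtrip == text)
```

The character references are also the closest thing XML has to a syntax for spelling out a Unicode character explicitly; a correctly saved UTF-8 file does not need them.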