UnicodeDecodeError when using json.dumps()

Problem description

I have strings like the following in my Python list (output taken from the command prompt):

>>> o['records'][5790]
(5790, 'Vlv-Gate-Assy-Mdl-\xe1M1-2-\xe19/16-10K-BB Credit Memo            ', 60,
 True, '40141613')
>>>

I tried the suggestions mentioned here: "Changing default encoding of Python?"

I also changed the default encoding to utf-16, but json.dumps() still threw an exception, as follows:

>>> write(o)
Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
  File "okapi_create_master.py", line 49, in write
    o = json.dumps(output)
  File "C:\Python27\lib\json\__init__.py", line 231, in dumps
    return _default_encoder.encode(obj)
  File "C:\Python27\lib\json\encoder.py", line 201, in encode
    chunks = self.iterencode(o, _one_shot=True)
  File "C:\Python27\lib\json\encoder.py", line 264, in iterencode
    return _iterencode(o, 0)
UnicodeDecodeError: 'utf8' codec can't decode byte 0xe1 in position 25: invalid
continuation byte

I can't figure out what kind of transformation such strings need so that json.dumps() works.

Recommended answer

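\xe1 cannot be decoded as UTF-8 or UTF-16. In UTF-8, 0xe1 is the lead byte of a three-byte sequence, so it must be followed by two continuation bytes; UTF-16 needs two bytes per code unit, so a single byte is truncated data: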

>>> '\xe1'.decode('utf-8')
Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
  File "C:\Python27\lib\encodings\utf_8.py", line 16, in decode
    return codecs.utf_8_decode(input, errors, True)
UnicodeDecodeError: 'utf8' codec can't decode byte 0xe1 in position 0: unexpected end of data
>>> '\xe1'.decode('utf-16')
Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
  File "C:\Python27\lib\encodings\utf_16.py", line 16, in decode
    return codecs.utf_16_decode(input, errors, True)
UnicodeDecodeError: 'utf16' codec can't decode byte 0xe1 in position 0: truncated data
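
The same check can be written as a small probe. This is a minimal sketch for Python 3 (the transcripts in this thread are Python 2.7); `try_decode` is a hypothetical helper name, not part of any library, and the list of candidate codecs is an assumption.

```python
# Python 3 sketch: probe which codecs accept a given byte string.
# try_decode is a hypothetical helper, not a standard-library function.
def try_decode(data, encodings=("utf-8", "utf-16", "latin-1")):
    """Return {encoding: decoded text, or None if decoding failed}."""
    results = {}
    for name in encodings:
        try:
            results[name] = data.decode(name)
        except UnicodeDecodeError:
            # The codec rejected the bytes; record the failure and move on.
            results[name] = None
    return results


probe = try_decode(b"\xe1")
print(probe)
```

A successful decode does not prove the guess is right: latin-1 accepts every byte value, so it never fails. It only shows which codecs can be ruled out.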

Try the latin-1 encoding, which maps every byte to a code point (0xe1 becomes U+00E1):

>>> record = (5790, 'Vlv-Gate-Assy-Mdl-\xe1M1-2-\xe19/16-10K-BB Credit Memo            ',
...           60, True, '40141613')
>>> json.dumps(record, encoding='latin1')
'[5790, "Vlv-Gate-Assy-Mdl-\\u00e1M1-2-\\u00e19/16-10K-BB Credit Memo            ", 60, true, "40141613"]'
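
The `encoding` keyword used above exists only in Python 2; Python 3's json.dumps() does not accept it and refuses bytes outright. A rough Python 3 equivalent is to decode the bytes before serializing. The sketch below assumes the data really is Latin-1; `decode_bytes` is a hypothetical helper introduced for illustration.

```python
import json


def decode_bytes(value, encoding="latin-1"):
    """Recursively decode bytes found in lists, tuples and dicts.

    Hypothetical helper; the latin-1 default is an assumption about the data.
    """
    if isinstance(value, bytes):
        return value.decode(encoding)
    if isinstance(value, (list, tuple)):
        # Tuples become lists, which is how JSON would serialize them anyway.
        return [decode_bytes(item, encoding) for item in value]
    if isinstance(value, dict):
        return {decode_bytes(k, encoding): decode_bytes(v, encoding)
                for k, v in value.items()}
    return value


record = (5790, b"Vlv-Gate-Assy-Mdl-\xe1M1-2-\xe19/16-10K-BB Credit Memo", 60,
          True, b"40141613")
encoded = json.dumps(decode_bytes(record))
print(encoded)
```

Decoding once at the boundary, where the bytes enter the program, keeps the rest of the code working with text only.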

Alternatively, pass ensure_ascii=False to json.dumps so that it does not try to decode the string:

>>> json.dumps(record, ensure_ascii=False)
'[5790, "Vlv-Gate-Assy-Mdl-\xe1M1-2-\xe19/16-10K-BB Credit Memo            ", 60, true, "40141613"]'
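
Note that the Python 2 output above still contains the raw \xe1 bytes, so it is not valid UTF-8. As a hedged illustration of how the same option behaves in Python 3, where strings are already text and ensure_ascii only controls escaping, the sketch below decodes first (assuming Latin-1 input) and then encodes the result as UTF-8 explicitly. The shortened sample string is made up from the record in the question.

```python
import json

# Assumption: the source bytes are Latin-1. Decode once, up front.
name = b"Vlv-Gate-Assy-Mdl-\xe1M1".decode("latin-1")

# Default: non-ASCII characters are written as \uXXXX escapes.
escaped = json.dumps([name])

# ensure_ascii=False: the character is kept as-is in the resulting str.
literal = json.dumps([name], ensure_ascii=False)

# Encode explicitly when the JSON has to be written out as bytes.
utf8_bytes = literal.encode("utf-8")

print(escaped)
print(literal)
```

Both forms parse back to the same data; the choice only affects how the JSON text looks on disk or on the wire.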

