Forcing UTF-8 over cp1252 (Python 3)


Problem description

I've written some code that makes use of the Biopython Entrez wrapper. The code was working fine on my previous Win10 laptop (Python 3.5.1), but I've just ported it to a new Win10 laptop with the same versions of Python and every package installed, and I'm now getting a decode error.

The traceback leads to a function that fetches text: it attempts to decode the text using cp1252 when it should be using UTF-8. I know that similar questions have been asked, but none have dealt with this problem happening inside a package (Biopython in my case). Copying the UTF-8 encoding file in Python/lib and renaming it to cp1252.py solves the problem, but this is obviously not a long-term solution.

File "C:\Users\arjun\AppData\Local\Programs\Python\Python35-32\lib\encodings\cp1252.py", line 23, in decode
    return codecs.charmap_decode(input,self.errors,decoding_table)[0]

UnicodeDecodeError: 'charmap' codec can't decode byte 0x81 in position 21715: character maps to <undefined>

Solution

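Use the io module for reading if you're using Python 3.x (https://docs.python.org/2/library/io.html#io.open). By default, it uses the preferred encoding of the platform it runs on, which on a Windows machine is often cp1252, so the same code can behave differently on two machines. To avoid depending on that default, specify the encoding yourself, as explained in the docs, for example encoding="utf-8".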
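
As a rough illustration of the advice above, here is a minimal sketch that writes a small UTF-8 file, shows that reading it as cp1252 fails in the same way as the traceback, and then reads it back with an explicit encoding. The helper name read_text_utf8 and the sample string are hypothetical and chosen only for this example; nothing here comes from Biopython, and the sketch does not change how a third-party package opens its own files.

```python
import io
import os
import tempfile


def read_text_utf8(path):
    """Read a text file as UTF-8, ignoring the platform's default encoding."""
    with io.open(path, "r", encoding="utf-8") as handle:
        return handle.read()


# Hypothetical sample data: U+00C1 is stored in UTF-8 as the bytes C3 81,
# and byte 0x81 has no mapping in Python's cp1252 codec.
sample = "\u00c1lvarez"

fd, path = tempfile.mkstemp(suffix=".txt")
os.close(fd)
try:
    # Write the sample as UTF-8 so the file contains the byte 0x81.
    with io.open(path, "w", encoding="utf-8") as handle:
        handle.write(sample)

    # Reading with cp1252 (a common Windows default) raises UnicodeDecodeError.
    try:
        with io.open(path, "r", encoding="cp1252") as handle:
            handle.read()
        cp1252_failed = False
    except UnicodeDecodeError:
        cp1252_failed = True

    # Reading with an explicit UTF-8 encoding recovers the original text.
    recovered = read_text_utf8(path)
finally:
    os.remove(path)
```

In Python 3 the built-in open is the same function as io.open, so passing encoding="utf-8" to open has the same effect.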
