Problem description
I am trying to create a CSV of data from a MySQL RDB to move it over to Amazon Redshift. However, one of the fields contains descriptions, and some of those descriptions contain the ’ character (the right single quotation mark). Previously, when I ran the code, it would give me
UnicodeEncodeError: 'charmap' codec can't encode character '\x92' in position 62: character maps to <undefined>
I then tried using REPLACE to get rid of the right single quotation marks:
import csv
import time

import pymysql

db = pymysql.connect(host='host', port=3306, user="user", passwd="password", db="db", autocommit=True)
cur = db.cursor()
#cur.execute("call inv1_view_prod.`Email_agg`")
cur.execute("""select field_1,
field_2,
field_3,
field_4,
replace(field_4_desc,"’","") field_4_desc,
field_5,
field_6,
field_7
from table_name """)
emails = cur.fetchall()
# raw string so that \f and \t in the Windows path are not read as escape sequences
with open(r'O:\file\path\to\file_name.csv', 'w') as fileout:
    writer = csv.writer(fileout)
    writer.writerows(emails)
time.sleep(1)
However, this gave me the error:
UnicodeEncodeError: 'latin-1' codec can't encode character '\u2019' in position 132: ordinal not in range(256)
I noticed that 132 is the position of the right single quotation mark in the SQL statement, so I believe the code itself may be having an issue with it. I tried using the regular straight apostrophe instead of the right single quotation mark in the REPLACE statement; however, this did not replace the character, and the original error came back. Does anyone know why it won't accept the single quote and how to fix it?
Recommended answer
\u2019 is Unicode for ’ (UTF-8 hex E28099), the "RIGHT SINGLE QUOTATION MARK". The direct latin1 equivalent is hex 92 (in MySQL, latin1 is effectively Windows cp1252; ISO-8859-1 proper, which is what Python's 'latin-1' codec implements, has no printable character there). Some word processing products use it instead of the apostrophe (').
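The byte values above can be checked directly in Python. This is a small sketch using only the standard codecs; the variable names are illustrative and not from the original post.

```python
ch = "\u2019"  # RIGHT SINGLE QUOTATION MARK

# The same character has different byte representations per encoding
utf8_hex = ch.encode("utf-8").hex().upper()     # three bytes in UTF-8
cp1252_hex = ch.encode("cp1252").hex().upper()  # one byte in Windows cp1252

# Python's 'latin-1' codec is strict ISO-8859-1, which only covers
# code points 0..255, so U+2019 cannot be encoded with it.
try:
    ch.encode("latin-1")
    latin1_ok = True
    latin1_error = ""
except UnicodeEncodeError as exc:
    latin1_ok = False
    latin1_error = str(exc)

print(utf8_hex, cp1252_hex, latin1_ok)
print(latin1_error)
```

The failing `latin-1` call produces the same kind of message as the second error in the question, which suggests that some layer is encoding text as latin-1 without having been told otherwise.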
You are getting the error messages not because the character can't be handled, but because the configuration fails to declare which encoding is used where.
"132" seems irrelevant: decimal 132 is hex 84, which in latin1 (cp1252) is „ (UTF-8 hex E2809E), not the right single quotation mark.
Notes on Python: http://mysql.rjweb.org/doc.php/charcoll#python
Notes on other charset issues: Trouble with UTF-8 characters; what I see is not what I stored
Without knowing the schema or the Python configuration, I can't be more specific.
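Even so, a generic sketch of "declaring which encoding is used where" might look like the following. It rests on assumptions that are not in the original post: that the connection should use `utf8mb4`, that the CSV should be written as UTF-8, and that the helper names `fetch_rows` and `write_csv` and the sample row are purely hypothetical. The credentials are the placeholders from the question, and `fetch_rows` is defined but not called because it needs a live server.

```python
import csv
import os
import tempfile


def fetch_rows():
    """Hypothetical helper; not called below because it needs a live MySQL server."""
    import pymysql  # third-party; imported lazily so the rest runs without it

    # charset declares the encoding used on the client/server connection
    db = pymysql.connect(host="host", port=3306, user="user",
                         password="password", database="db",
                         charset="utf8mb4")
    with db.cursor() as cur:
        cur.execute("select field_4_desc from table_name")
        return cur.fetchall()


def write_csv(path, rows):
    # encoding declares the bytes written to disk, instead of relying on
    # the platform default (often cp1252 on Windows);
    # newline="" is what the csv module expects for files it writes
    with open(path, "w", encoding="utf-8", newline="") as fileout:
        csv.writer(fileout).writerows(rows)


# Stand-in data so the sketch runs without a database
sample_rows = [(1, "customer\u2019s note")]
out_path = os.path.join(tempfile.mkdtemp(), "file_name.csv")
write_csv(out_path, sample_rows)

# Read the file back with the same declared encoding
with open(out_path, encoding="utf-8", newline="") as f:
    round_trip = list(csv.reader(f))

print(round_trip)
```

With both ends declared, the right single quotation mark survives the round trip and there is no need to strip it with REPLACE. Whether `utf8mb4` and UTF-8 are the right choices depends on the actual schema and on what the downstream load expects.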