How to use `charset` and `encoding` in SQLAlchemy's `create_engine` (to create a pandas dataframe)?

Problem description

I am very confused by the way charset and encoding work in SQLAlchemy. I understand (and have read about) the difference between charsets and encodings, and I have a good picture of the history of encodings.

I have a table in MySQL that uses latin1_swedish_ci (Why? Possibly because of this). I need to create a pandas dataframe in which I get the proper characters (and not weird symbols). Initially, this was in the code:

import pandas
from sqlalchemy import create_engine
connect_engine = create_engine('mysql://user:password@host/db')  # placeholder credentials and host
sql_query = "select * from table1"
df = pandas.read_sql(sql_query, connect_engine)

We started having trouble with the Š character: it corresponds to the Unicode code point u'\u0160', but we get '\x8a' instead. I expected this to work:

connect_engine = create_engine('mysql://user:password@host/db', encoding='utf8')

However, I kept getting '\x8a', which, I realized, makes sense given that the default of the encoding parameter is utf8. So I then tried encoding='latin1' to tackle the problem:

connect_engine = create_engine('mysql://user:password@host/db', encoding='latin1')

But I still get the same '\x8a'. To be clear, in both cases (encoding='utf8' and encoding='latin1'), I can do mystring.decode('latin1') but not mystring.decode('utf8').
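The decoding behaviour described above can be reproduced without a database. The following is a minimal sketch in Python 3 (the question itself uses Python 2 byte strings), and it assumes that MySQL's latin1 corresponds to Windows-1252, which Python calls cp1252. The variable names are only illustrative.

```python
# The single byte the question reports receiving instead of the Š character.
raw = b"\x8a"

# Assumption: MySQL's "latin1" behaves like Windows-1252 (Python's "cp1252"),
# where byte 0x8A is the letter Š (U+0160).
as_cp1252 = raw.decode("cp1252")

# Python's "latin1" is ISO-8859-1: it accepts every byte, but maps 0x8A
# to the control character U+008A rather than to Š.
as_latin1 = raw.decode("latin1")

# A lone 0x8A byte is not valid UTF-8, so this decode fails.
try:
    raw.decode("utf8")
    utf8_ok = True
except UnicodeDecodeError:
    utf8_ok = False

print(repr(as_cp1252), repr(as_latin1), utf8_ok)
```

This would explain why decode('latin1') succeeds without producing Š, while decode('utf8') raises an error: the value received is still the raw single-byte representation stored in the table.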

And then I rediscovered the charset parameter in the connection string, i.e. 'mysql://user:password@host/db?charset=latin1'. After trying all possible combinations of charset and encoding, I found that this one works:

connect_engine = create_engine('mysql://user:password@host/db?charset=utf8')
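One way to see the difference between the two settings is that charset lives inside the URL's query string, whereas encoding is a separate keyword argument to create_engine. As far as I understand, SQLAlchemy hands query-string options to the underlying database driver when it connects. The sketch below only uses the standard library to show where the option sits; driver_options is a hypothetical helper, not part of SQLAlchemy, and the URL uses placeholder credentials and host.

```python
from urllib.parse import urlsplit, parse_qs


def driver_options(url):
    """Return the query-string options of a database URL as a plain dict.

    Hypothetical helper for illustration; it does not reproduce
    SQLAlchemy's own URL parsing.
    """
    query = urlsplit(url).query
    # parse_qs returns a list per key; keep the last value for each option.
    return {key: values[-1] for key, values in parse_qs(query).items()}


# Placeholder URL in the same shape as the one in the question.
url = "mysql://user:password@host/db?charset=utf8"
options = driver_options(url)
print(options)
```

Under that reading, charset=utf8 changes what the MySQL connection itself sends back, which would be consistent with it being the only setting that changed the bytes in the dataframe.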

I would appreciate it if somebody could explain to me how to correctly use the charset in the connection string and the encoding parameter in create_engine.

Recommended answer