I have a pandas DataFrame, 'df', that I'm trying to upload to a Netezza database. I've been trying to do this with DataFrame.to_sql and a corresponding SQLAlchemy engine:
import pandas
import sqlalchemy
import urllib

def upload_test(data, table):
    # Python 2: URL-encode the raw ODBC connection string for the odbc_connect parameter
    quoted = urllib.quote_plus('DRIVER={NetezzaSQL};Server=SERVER;Database=DATA_BASE;UID=uid;PWD=pwd;Port=5480;')
    engine = sqlalchemy.create_engine('mssql+pyodbc:///?odbc_connect={}'.format(quoted))
    # Append the rows to the existing table; do not write the DataFrame index
    data.to_sql(name=table, con=engine, if_exists='append', index=False)

df = pandas.DataFrame(
    {
        'VAR1': pandas.Series(['2016-05-01', '2016-05-02']),
        'VAR2': pandas.Series([2500, 2500]),
        'VAR3': pandas.Series([211232, 211232]),
    }
)

upload_test(data=df, table='TABLE')
All this does is return a SQL error in the traceback in my console:
Traceback (most recent call last):
File "C:\Anaconda3\envs\python_2_7_Anaconda\lib\site-packages\IPython\core\interactiveshell.py", line 2885, in run_code
exec(code_obj, self.user_global_ns, self.user_ns)
File "<ipython-input-68-b2d5c19f9472>", line 19, in <module>
upload_test(data=df, table='TABLE')
File "<ipython-input-68-b2d5c19f9472>", line 4, in upload_test
data.to_sql(name=table, con=engine, if_exists='append', index=False)
File "C:\Anaconda3\envs\python_2_7_Anaconda\lib\site-packages\pandas\core\generic.py", line 1003, in to_sql
dtype=dtype)
File "C:\Anaconda3\envs\python_2_7_Anaconda\lib\site-packages\pandas\io\sql.py", line 569, in to_sql
chunksize=chunksize, dtype=dtype)
File "C:\Anaconda3\envs\python_2_7_Anaconda\lib\site-packages\pandas\io\sql.py", line 1240, in to_sql
table.create()
File "C:\Anaconda3\envs\python_2_7_Anaconda\lib\site-packages\pandas\io\sql.py", line 685, in create
if self.exists():
File "C:\Anaconda3\envs\python_2_7_Anaconda\lib\site-packages\pandas\io\sql.py", line 673, in exists
return self.pd_sql.has_table(self.name, self.schema)
File "C:\Anaconda3\envs\python_2_7_Anaconda\lib\site-packages\pandas\io\sql.py", line 1263, in has_table
schema or self.meta.schema,
File "C:\Anaconda3\envs\python_2_7_Anaconda\lib\site-packages\sqlalchemy\engine\base.py", line 1972, in run_callable
return conn.run_callable(callable_, *args, **kwargs)
File "C:\Anaconda3\envs\python_2_7_Anaconda\lib\site-packages\sqlalchemy\engine\base.py", line 1477, in run_callable
return callable_(self, *args, **kwargs)
File "C:\Anaconda3\envs\python_2_7_Anaconda\lib\site-packages\sqlalchemy\dialects\mssql\base.py", line 1466, in wrap
tablename, dbname, owner, schema, **kw)
File "C:\Anaconda3\envs\python_2_7_Anaconda\lib\site-packages\sqlalchemy\dialects\mssql\base.py", line 1475, in _switch_db
return fn(*arg, **kw)
File "C:\Anaconda3\envs\python_2_7_Anaconda\lib\site-packages\sqlalchemy\dialects\mssql\base.py", line 1621, in has_table
c = connection.execute(s)
File "C:\Anaconda3\envs\python_2_7_Anaconda\lib\site-packages\sqlalchemy\engine\base.py", line 914, in execute
return meth(self, multiparams, params)
File "C:\Anaconda3\envs\python_2_7_Anaconda\lib\site-packages\sqlalchemy\sql\elements.py", line 323, in _execute_on_connection
return connection._execute_clauseelement(self, multiparams, params)
File "C:\Anaconda3\envs\python_2_7_Anaconda\lib\site-packages\sqlalchemy\engine\base.py", line 1010, in _execute_clauseelement
compiled_sql, distilled_params
File "C:\Anaconda3\envs\python_2_7_Anaconda\lib\site-packages\sqlalchemy\engine\base.py", line 1146, in _execute_context
context)
File "C:\Anaconda3\envs\python_2_7_Anaconda\lib\site-packages\sqlalchemy\engine\base.py", line 1341, in _handle_dbapi_exception
exc_info
File "C:\Anaconda3\envs\python_2_7_Anaconda\lib\site-packages\sqlalchemy\util\compat.py", line 200, in raise_from_cause
reraise(type(exception), exception, tb=exc_tb)
File "C:\Anaconda3\envs\python_2_7_Anaconda\lib\site-packages\sqlalchemy\engine\base.py", line 1139, in _execute_context
context)
File "C:\Anaconda3\envs\python_2_7_Anaconda\lib\site-packages\sqlalchemy\engine\default.py", line 450, in do_execute
cursor.execute(statement, parameters)
ProgrammingError: (pyodbc.ProgrammingError) ('42000', '[42000] ERROR: \'SELECT [COLUMNS_1].[TABLE_SCHEMA], [COLUMNS_1].[TABLE_NAME], [COLUMNS_1].[COLUMN_NAME], [COLUMNS_1].[IS_NULLABLE], [COLUMNS_1].[DATA_TYPE], [COLUMNS_1].[ORDINAL_POSITION], [COLUMNS_1].[CHARACTER_MAXIMUM_LENGTH], [COLUMNS_1].[NUMERIC_PRECISION], [COLUMNS_1].[NUMERIC_SCALE], [COLUMNS_1].[COLUMN_DEFAULT], [COLUMNS_1].[COLLATION_NAME] FROM [INFORMATION_SCHEMA].[COLUMNS] AS [COLUMNS_1] WHERE [COLUMNS_1].[TABLE_NAME] = NULL AND [COLUMNS_1].[TABLE_SCHEMA] = NULL limit 0\'\nerror ^ found "[" (at char 8) expecting an identifier found a keyword (27) (SQLPrepare)') [SQL: u'SELECT [COLUMNS_1].[TABLE_SCHEMA], [COLUMNS_1].[TABLE_NAME], [COLUMNS_1].[COLUMN_NAME], [COLUMNS_1].[IS_NULLABLE], [COLUMNS_1].[DATA_TYPE], [COLUMNS_1].[ORDINAL_POSITION], [COLUMNS_1].[CHARACTER_MAXIMUM_LENGTH], [COLUMNS_1].[NUMERIC_PRECISION], [COLUMNS_1].[NUMERIC_SCALE], [COLUMNS_1].[COLUMN_DEFAULT], [COLUMNS_1].[COLLATION_NAME] \nFROM [INFORMATION_SCHEMA].[COLUMNS] AS [COLUMNS_1] \nWHERE [COLUMNS_1].[TABLE_NAME] = ? AND [COLUMNS_1].[TABLE_SCHEMA] = ?'] [parameters: (u'TABLE', u'dbo')]
I know the connection itself is sound, because I can read data through it just fine:
connection = engine.connect()
result = connection.execute("SELECT * FROM TABLE LIMIT 100")
for row in result:
    print row
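From what I've read on other sites, the problem is the dialect I chose for the SQLAlchemy engine, but I'm not sure that is really the issue. Is there some other object I could convert the DataFrame to? Should I try inserting into the table one row at a time?
Thanks!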
Best answer
I believe the problem here is, as you guessed, your choice of dialect for SQLAlchemy.
From the last line of the error output:
...found "[" (at char 8) expecting an identifier found a keyword...
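The generated query delimits column and table names with square brackets, which is an MSSQL-ism that Netezza won't accept.
I don't have any experience with SQLAlchemy, but if it doesn't offer a Netezza-specific dialect, try one of the Postgres dialects, since Postgres is what Netezza descends from.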
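To see the quoting difference without touching a database, here is a minimal sketch, assuming SQLAlchemy is installed. It asks the MSSQL and PostgreSQL dialects how each would quote one of the identifiers from the failing query. `quote_with` is a hypothetical helper written for this illustration, not part of SQLAlchemy, and whether Netezza accepts everything the PostgreSQL dialect emits is an assumption that would need testing against a real server.

```python
# Sketch only: assumes SQLAlchemy is installed. No database connection is made.
from sqlalchemy.dialects import mssql, postgresql


def quote_with(dialect, name):
    """Hypothetical helper: quote `name` the way `dialect` delimits identifiers."""
    return dialect.identifier_preparer.quote_identifier(name)


# Identifier taken from the failing query in the traceback.
name = "COLUMNS_1"

mssql_quoted = quote_with(mssql.dialect(), name)    # MSSQL style: square brackets
pg_quoted = quote_with(postgresql.dialect(), name)  # PostgreSQL style: double quotes

print(mssql_quoted)
print(pg_quoted)
```

The bracketed form matches what appears in the error message, which suggests that the statements generated by the `mssql+pyodbc` engine, rather than the DataFrame or the connection, are what Netezza is rejecting.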
Regarding "python - Error when trying to upload a Pandas DataFrame to Netezza", a similar question can be found on Stack Overflow: https://stackoverflow.com/questions/37423716/