I have a pandas DataFrame, 'df', that I am trying to upload to a Netezza database. I have been trying to do this with DataFrame.to_sql and a corresponding SQLAlchemy engine:

import pandas
import sqlalchemy
import urllib

# Python 2.7, as in the traceback below; on Python 3 quote_plus lives in urllib.parse.
def upload_test(data, table):
    quoted = urllib.quote_plus('DRIVER={NetezzaSQL};Server=SERVER;Database=DATA_BASE;UID=uid;PWD=pwd;Port=5480;')
    engine = sqlalchemy.create_engine('mssql+pyodbc:///?odbc_connect={}'.format(quoted))
    data.to_sql(name=table, con=engine, if_exists='append', index=False)

df = pandas.DataFrame(
    {
        'VAR1': pandas.Series(['2016-05-01', '2016-05-02'])
        , 'VAR2': pandas.Series([2500, 2500])
        , 'VAR3': pandas.Series([211232, 211232])
    }
)
upload_test(data=df, table='TABLE')

This only returns a SQL error, with the following traceback in my console:
Traceback (most recent call last):
  File "C:\Anaconda3\envs\python_2_7_Anaconda\lib\site-packages\IPython\core\interactiveshell.py", line 2885, in run_code
    exec(code_obj, self.user_global_ns, self.user_ns)
  File "<ipython-input-68-b2d5c19f9472>", line 19, in <module>
    upload_test(data=df, table='TABLE')
  File "<ipython-input-68-b2d5c19f9472>", line 4, in upload_test
    data.to_sql(name=table, con=engine, if_exists='append', index=False)
  File "C:\Anaconda3\envs\python_2_7_Anaconda\lib\site-packages\pandas\core\generic.py", line 1003, in to_sql
    dtype=dtype)
  File "C:\Anaconda3\envs\python_2_7_Anaconda\lib\site-packages\pandas\io\sql.py", line 569, in to_sql
    chunksize=chunksize, dtype=dtype)
  File "C:\Anaconda3\envs\python_2_7_Anaconda\lib\site-packages\pandas\io\sql.py", line 1240, in to_sql
    table.create()
  File "C:\Anaconda3\envs\python_2_7_Anaconda\lib\site-packages\pandas\io\sql.py", line 685, in create
    if self.exists():
  File "C:\Anaconda3\envs\python_2_7_Anaconda\lib\site-packages\pandas\io\sql.py", line 673, in exists
    return self.pd_sql.has_table(self.name, self.schema)
  File "C:\Anaconda3\envs\python_2_7_Anaconda\lib\site-packages\pandas\io\sql.py", line 1263, in has_table
    schema or self.meta.schema,
  File "C:\Anaconda3\envs\python_2_7_Anaconda\lib\site-packages\sqlalchemy\engine\base.py", line 1972, in run_callable
    return conn.run_callable(callable_, *args, **kwargs)
  File "C:\Anaconda3\envs\python_2_7_Anaconda\lib\site-packages\sqlalchemy\engine\base.py", line 1477, in run_callable
    return callable_(self, *args, **kwargs)
  File "C:\Anaconda3\envs\python_2_7_Anaconda\lib\site-packages\sqlalchemy\dialects\mssql\base.py", line 1466, in wrap
    tablename, dbname, owner, schema, **kw)
  File "C:\Anaconda3\envs\python_2_7_Anaconda\lib\site-packages\sqlalchemy\dialects\mssql\base.py", line 1475, in _switch_db
    return fn(*arg, **kw)
  File "C:\Anaconda3\envs\python_2_7_Anaconda\lib\site-packages\sqlalchemy\dialects\mssql\base.py", line 1621, in has_table
    c = connection.execute(s)
  File "C:\Anaconda3\envs\python_2_7_Anaconda\lib\site-packages\sqlalchemy\engine\base.py", line 914, in execute
    return meth(self, multiparams, params)
  File "C:\Anaconda3\envs\python_2_7_Anaconda\lib\site-packages\sqlalchemy\sql\elements.py", line 323, in _execute_on_connection
    return connection._execute_clauseelement(self, multiparams, params)
  File "C:\Anaconda3\envs\python_2_7_Anaconda\lib\site-packages\sqlalchemy\engine\base.py", line 1010, in _execute_clauseelement
    compiled_sql, distilled_params
  File "C:\Anaconda3\envs\python_2_7_Anaconda\lib\site-packages\sqlalchemy\engine\base.py", line 1146, in _execute_context
    context)
  File "C:\Anaconda3\envs\python_2_7_Anaconda\lib\site-packages\sqlalchemy\engine\base.py", line 1341, in _handle_dbapi_exception
    exc_info
  File "C:\Anaconda3\envs\python_2_7_Anaconda\lib\site-packages\sqlalchemy\util\compat.py", line 200, in raise_from_cause
    reraise(type(exception), exception, tb=exc_tb)
  File "C:\Anaconda3\envs\python_2_7_Anaconda\lib\site-packages\sqlalchemy\engine\base.py", line 1139, in _execute_context
    context)
  File "C:\Anaconda3\envs\python_2_7_Anaconda\lib\site-packages\sqlalchemy\engine\default.py", line 450, in do_execute
    cursor.execute(statement, parameters)
ProgrammingError: (pyodbc.ProgrammingError) ('42000', '[42000] ERROR:  \'SELECT [COLUMNS_1].[TABLE_SCHEMA], [COLUMNS_1].[TABLE_NAME], [COLUMNS_1].[COLUMN_NAME], [COLUMNS_1].[IS_NULLABLE], [COLUMNS_1].[DATA_TYPE], [COLUMNS_1].[ORDINAL_POSITION], [COLUMNS_1].[CHARACTER_MAXIMUM_LENGTH], [COLUMNS_1].[NUMERIC_PRECISION], [COLUMNS_1].[NUMERIC_SCALE], [COLUMNS_1].[COLUMN_DEFAULT], [COLUMNS_1].[COLLATION_NAME] FROM [INFORMATION_SCHEMA].[COLUMNS] AS [COLUMNS_1] WHERE [COLUMNS_1].[TABLE_NAME] = NULL AND [COLUMNS_1].[TABLE_SCHEMA] = NULL limit 0\'\nerror           ^ found "[" (at char 8) expecting an identifier found a keyword (27) (SQLPrepare)') [SQL: u'SELECT [COLUMNS_1].[TABLE_SCHEMA], [COLUMNS_1].[TABLE_NAME], [COLUMNS_1].[COLUMN_NAME], [COLUMNS_1].[IS_NULLABLE], [COLUMNS_1].[DATA_TYPE], [COLUMNS_1].[ORDINAL_POSITION], [COLUMNS_1].[CHARACTER_MAXIMUM_LENGTH], [COLUMNS_1].[NUMERIC_PRECISION], [COLUMNS_1].[NUMERIC_SCALE], [COLUMNS_1].[COLUMN_DEFAULT], [COLUMNS_1].[COLLATION_NAME] \nFROM [INFORMATION_SCHEMA].[COLUMNS] AS [COLUMNS_1] \nWHERE [COLUMNS_1].[TABLE_NAME] = ? AND [COLUMNS_1].[TABLE_SCHEMA] = ?'] [parameters: (u'TABLE', u'dbo')]

I know the connection itself works, because I can read data through it without any problem:
connection = engine.connect()
result = connection.execute("SELECT * FROM TABLE LIMIT 100")
for row in result:
    print row

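From what I have read on other sites, the problem may be the dialect I chose for the SQLAlchemy engine, but I am not sure that is the cause. Is there another object I could convert the DataFrame to? Should I try inserting into the table one row at a time?
Thanks!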

Best answer

I believe the problem here is, as you guessed, the dialect you chose for SQLAlchemy.
From the last line of the error output:

...found "[" (at char 8) expecting an identifier found a keyword...

It is delimiting the column and table names with square brackets, which is an MSSQL-ism that Netezza won't accept.
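
To make the quoting difference concrete, here is a minimal sketch that builds a shortened version of the failing query twice: once with MSSQL-style square brackets and once with ANSI-style double quotes, which PostgreSQL uses and which Netezza is assumed here to accept. The helpers quote_mssql, quote_ansi and build_select are hypothetical names written for this illustration; they are not SQLAlchemy's API, and the sketch only shows the text each quoting style produces.

```python
def quote_mssql(identifier):
    """Quote an identifier MSSQL-style: [name] (hypothetical helper)."""
    return "[{}]".format(identifier.replace("]", "]]"))


def quote_ansi(identifier):
    """Quote an identifier with ANSI double quotes: "name" (hypothetical helper)."""
    return '"{}"'.format(identifier.replace('"', '""'))


def build_select(quote):
    """Build a shortened form of the query from the traceback with the given quoting."""
    column = ".".join([quote("COLUMNS_1"), quote("TABLE_NAME")])
    source = ".".join([quote("INFORMATION_SCHEMA"), quote("COLUMNS")])
    return "SELECT {} FROM {} AS {}".format(column, source, quote("COLUMNS_1"))


mssql_sql = build_select(quote_mssql)
ansi_sql = build_select(quote_ansi)

print(mssql_sql)
print(ansi_sql)
# 1-based position of the first bracket in the MSSQL-style statement
print(mssql_sql.index("[") + 1)
```

In the bracketed statement the first "[" is the eighth character, which lines up with the "(at char 8)" position in the error message.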
I have no experience with SQLAlchemy, but if it has no Netezza-specific dialect to choose from, try one of the Postgres dialects, since that is Netezza's lineage.
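
The question also asks whether the rows could be inserted another way. If no suitable dialect turns up, one possible route is to skip to_sql and send a parameterized INSERT through a plain DB-API cursor, which generates no dialect-specific SQL. The sketch below is an assumption-laden illustration, not a tested Netezza recipe: insert_rows is a hypothetical helper, the table name test_table is made up, and an in-memory sqlite3 connection stands in for the pyodbc connection because both use "?" placeholders.

```python
import sqlite3


def insert_rows(connection, table, columns, rows):
    """Insert rows through a DB-API connection that uses '?' placeholders.

    Hypothetical helper for illustration. Table and column names cannot be
    bound as parameters, so only pass trusted identifiers.
    """
    placeholders = ", ".join("?" for _ in columns)
    statement = "INSERT INTO {} ({}) VALUES ({})".format(
        table, ", ".join(columns), placeholders)
    cursor = connection.cursor()
    cursor.executemany(statement, rows)  # one call covers every row
    connection.commit()
    cursor.close()
    return statement


# Stand-in for the real connection (assumption: pyodbc.connect(...) in practice).
connection = sqlite3.connect(":memory:")
connection.execute(
    "CREATE TABLE test_table (VAR1 TEXT, VAR2 INTEGER, VAR3 INTEGER)")

columns = ["VAR1", "VAR2", "VAR3"]
rows = [("2016-05-01", 2500, 211232), ("2016-05-02", 2500, 211232)]

statement = insert_rows(connection, "test_table", columns, rows)
count = connection.execute("SELECT COUNT(*) FROM test_table").fetchone()[0]
print(statement)
print(count)
```

With the DataFrame from the question, the row tuples could be produced with list(df.itertuples(index=False, name=None)) and the column names with list(df.columns). Whether the Netezza ODBC driver handles executemany efficiently is not something this sketch verifies.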

Regarding "python - Error when trying to upload a Pandas DataFrame to Netezza", a similar question can be found on Stack Overflow: https://stackoverflow.com/questions/37423716/