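I have a script that uses executemany() to insert a DataFrame into a table.

The problem is that the table has an ID column as its primary key, and sometimes rows with an ID that already exists get inserted.

Is there a simple way to handle this exception and let executemany() carry on?

The alternative I am considering is to look up all of the DataFrame's IDs in the table and drop the matching rows before inserting into the database... but I don't know whether that is feasible.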
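A minimal sketch of that pre-filtering idea, working on plain row tuples like the ones passed to executemany(). The helper name `drop_known_ids`, the `existing_ids` set, and the `key_index` position of the primary-key column are all hypothetical; the post does not say which column holds the key.

```python
def drop_known_ids(rows, existing_ids, key_index=0):
    """Keep rows whose key is not in existing_ids and not repeated earlier in rows.

    rows         -- iterable of tuples, one per record to insert
    existing_ids -- keys already present in the table (assumed fetched beforehand)
    key_index    -- position of the primary-key value in each tuple (assumption)
    """
    seen = set(existing_ids)
    kept = []
    for row in rows:
        key = row[key_index]
        if key not in seen:
            seen.add(key)      # also guards against duplicates inside the batch
            kept.append(row)
    return kept
```

The existing keys would come from a query against the table's primary-key column, and the filtered list would then be handed to executemany(). Note that another writer could still insert a clashing key between the check and the insert, so this narrows the problem rather than ruling it out.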

My code:

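import time

# Excerpt from a method: self.conn / self.cursor are the pyodbc connection and cursor.
# One tuple of values per DataFrame row.
params = (tuple(row) for _, row in df.iterrows())
sql = '''INSERT INTO stilingue.stalker_comments values(?,?,?,?,?,?,?,?,?,?,?,?,?,?,?,?,?,?,?,?,?,?,?,?,?,?,?,?,?,?,?,?,?)'''
start = time.time()
try:
    self.cursor.executemany(sql, params)
    self.conn.commit()
except Exception as e:
    # A single duplicate key aborts the whole batch, so everything is rolled back.
    print(e)
    self.conn.rollback()
    print('Something went wrong...')
end = time.time()
print('Execution time: {0:.2f} seconds.'.format(end-start))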


DataFrame:

    channel followers   gender  hashtags    interactions    likes   location    mentions    name    page_comment    ... text    themes  uid user_image_url  user_url    username    verified    videoplays  business    rt_count
0   Inbox do Facebook   0   Não Definido        0   0           Midiam Mendes   False   ... Sacanagem isso né?? Poorq vocês dizeram que o ...       1995608377159933    https://storage.googleapis.com/usersstilingue/...           False   0   Itaú    0
1   Inbox do Facebook   0   Não Definido        0   0           Midiam Mendes   False   ... Eu tenho provas , e posso processar vocês!!     1995608377159933    https://storage.googleapis.com/usersstilingue/...           False   0   Itaú    0
2   Inbox do Facebook   0   Não Definido        0   0           Midiam Mendes   False   ... Isso é um absurdo       1995608377159933    https://storage.googleapis.com/usersstilingue/...           False   0   Itaú    0


Traceback:

('23000', "[23000] [Microsoft][ODBC SQL Server Driver][SQL Server]Violation of PRIMARY KEY constraint 'PK__stalker___DD37D91A4691B0F7'. Cannot insert duplicate key in object 'stilingue.stalker_comments'. The duplicate key value is (m__g64-pbys7OlEvp8xmfyktlNIHrUPQPiNrcKrPVOF_Lj84OJfN4WtAJ92lj7YnzAOQ1B7EDCJf85k_UcwB0-4Q). (2627) (SQLExecDirectW); [23000] [Microsoft][ODBC SQL Server Driver][SQL Server]The statement has been terminated. (3621)")

Best answer

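If the data is not large, the simplest approach is to create a temporary table in the database without a PK. Insert the data into that temporary table, delete the duplicates from it (on SQL Server you can use the following syntax to delete them), and then insert the data into the main table.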

WITH table_1 AS (
    SELECT *, RN = ROW_NUMBER() OVER (PARTITION BY [pk_field] ORDER BY date)
    FROM [temporary_table]
)
DELETE FROM table_1 WHERE RN > 1
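The whole workflow could be driven from Python roughly as follows. This is only a sketch under stated assumptions: `conn` is taken to be an open pyodbc connection to SQL Server with autocommit off, the function name and every parameter are hypothetical, the target table is assumed to have no identity column, and the table and column names are interpolated into the SQL, so they must come from trusted code. The `NOT EXISTS` filter is an addition to the answer above: it also skips keys that are already in the main table, not just duplicates within the batch.

```python
def insert_skipping_duplicates(conn, rows, target, staging, key, n_cols):
    """Load rows into a key-less staging table, then copy only new keys into target."""
    placeholders = ",".join("?" * n_cols)
    cur = conn.cursor()
    try:
        # Empty copy of the target's columns; SELECT ... INTO does not copy the PK.
        cur.execute(f"SELECT TOP 0 * INTO {staging} FROM {target}")
        # Duplicates are harmless here because the staging table has no constraint.
        cur.executemany(f"INSERT INTO {staging} VALUES ({placeholders})", rows)
        # Keep one row per key; which duplicate survives is arbitrary in this sketch.
        cur.execute(
            f"WITH ranked AS ("
            f" SELECT ROW_NUMBER() OVER (PARTITION BY {key} ORDER BY {key}) AS rn"
            f" FROM {staging}) "
            f"DELETE FROM ranked WHERE rn > 1"
        )
        # Copy only the keys that the target does not already contain.
        cur.execute(
            f"INSERT INTO {target} SELECT s.* FROM {staging} AS s "
            f"WHERE NOT EXISTS (SELECT 1 FROM {target} AS t WHERE t.{key} = s.{key})"
        )
        cur.execute(f"DROP TABLE {staging}")
        conn.commit()
    except Exception:
        conn.rollback()  # undoes the staging table as well, since DDL is transactional
        raise
```

With the question's table this might be called with `target="stilingue.stalker_comments"`, `n_cols=33`, a staging name of your choosing, and `key` set to the primary-key column, whose name is not given in the post.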

Regarding python - how to handle primary key constraint violations with pyodbc executemany(), a similar question can be found on Stack Overflow: https://stackoverflow.com/questions/54308346/
