问题描述
Python版本-2.7.6
Pandas版本-0.17.1
MySQLdb版本-1.2.5
在我的数据库(PRODUCT
)中,有一个表(XML_FEED
).表XML_FEED巨大(上百万条记录)我有一个pandas.DataFrame()(PROCESSED_DF
).数据框具有数千行.
现在我需要运行
REPLACE INTO TABLE PRODUCT.XML_FEED
(COL1, COL2, COL3, COL4, COL5),
VALUES (PROCESSED_DF.values)
问题:-
有没有办法在熊猫中运行REPLACE INTO TABLE
?我已经检查了pandas.DataFrame.to_sql()
,但这不是我所需要的.我不喜欢用pandas读取XML_FEED
表,因为它非常大.
直到此版本(0.17.1)
,我都无法在熊猫中找到任何直接的方法.我报告了相同的功能要求.我在项目中通过使用MySQLdb
然后使用DataFrame.to_sql(if_exists='append')
假设
1)product_id是我在表PRODUCT
中的主键2)feed_id是我在表XML_FEED中的主键.
简单版本
import MySQLdb
import sqlalchemy
import pandas
con = MySQLdb.connect('localhost','root','my_password', 'database_name')
con_str = 'mysql+mysqldb://root:my_password@localhost/database_name'
engine = sqlalchemy.create_engine(con_str) #because I am using mysql
df = pandas.read_sql('SELECT * from PRODUCT', con=engine)
df_product_id = df['product_id']
product_id_str = (str(list(df_product_id.values))).strip('[]')
delete_str = 'DELETE FROM XML_FEED WHERE feed_id IN ({0})'.format(product_id_str)
cur = con.cursor()
cur.execute(delete_str)
con.commit()
df.to_sql('XML_FEED', if_exists='append', con=engine)# you can use flavor='mysql' if you do not want to create sqlalchemy engine but it is depreciated
请注意:-REPLACE [INTO]
语法允许我们将INSERT
行插入表中,除了如果发生UNIQUE KEY
(包括PRIMARY KEY
)冲突,旧行在新的INSERT之前被删除,因此没有冲突. >
Python Version - 2.7.6
Pandas Version - 0.17.1
MySQLdb Version - 1.2.5
In my database ( PRODUCT
) , I have a table ( XML_FEED
). The table XML_FEED is huge ( Millions of record )I have a pandas.DataFrame() ( PROCESSED_DF
). The dataframe has thousands of rows.
Now I need to run this
REPLACE INTO TABLE PRODUCT.XML_FEED
(COL1, COL2, COL3, COL4, COL5),
VALUES (PROCESSED_DF.values)
Question:-
Is there a way to run REPLACE INTO TABLE
in pandas? I already checked pandas.DataFrame.to_sql()
but that is not what I need. I do not prefer to read XML_FEED
table in pandas because it very huge.
Till this version (0.17.1)
I am unable find any direct way to do this in pandas. I reported a feature request for the same.I did this in my project with executing some queries using MySQLdb
and then using DataFrame.to_sql(if_exists='append')
Suppose
1) product_id is my primary key in table PRODUCT
2) feed_id is my primary key in table XML_FEED.
SIMPLE VERSION
import MySQLdb
import sqlalchemy
import pandas
con = MySQLdb.connect('localhost','root','my_password', 'database_name')
con_str = 'mysql+mysqldb://root:my_password@localhost/database_name'
engine = sqlalchemy.create_engine(con_str) #because I am using mysql
df = pandas.read_sql('SELECT * from PRODUCT', con=engine)
df_product_id = df['product_id']
product_id_str = (str(list(df_product_id.values))).strip('[]')
delete_str = 'DELETE FROM XML_FEED WHERE feed_id IN ({0})'.format(product_id_str)
cur = con.cursor()
cur.execute(delete_str)
con.commit()
df.to_sql('XML_FEED', if_exists='append', con=engine)# you can use flavor='mysql' if you do not want to create sqlalchemy engine but it is depreciated
Please note:-The REPLACE [INTO]
syntax allows us to INSERT
a row into a table, except that if a UNIQUE KEY
(including PRIMARY KEY
) violation occurs, the old row is deleted prior to the new INSERT, hence no violation.
这篇关于用pandas DataFrame替换mysql数据库表中的行的文章就介绍到这了,希望我们推荐的答案对大家有所帮助,也希望大家多多支持!