Problem description
I have an application that I'm writing in Java with plain SQL, so nothing specific to MySQL or SQL Server here, since it might have to run on either. One data persist operation has to grab the data out of the DB, compare it with what has been submitted, and then insert, update or delete accordingly.
I've improved the performance of the operation considerably by batching the JDBC calls.
So for my INSERTs, I just call the Statement.addBatch() method for the whole data set to be inserted, and the JDBC driver creates
INSERT INTO data (parentId, seriesDate, valueDate, value)
VALUES (a,b,c,d),(a,b,e,f),(a,b,g,h)... etc
For the DELETEs, I just use
DELETE FROM data WHERE parentId = a AND seriesDate = b;
and then I can re-insert them. (It may be better to take another approach by composing one long
DELETE FROM data WHERE (parentId = 1 AND seriesDate = b)
OR (parentId = 2 AND seriesDate = c)
OR (parentId = 3 AND seriesDate = d) ...
but that's not the issue here.) My main problem is that the UPDATEs are really slow, twice as slow as the INSERTs.
I get 1000 separate statements:
UPDATE data SET value = 4
WHERE parentId = 1 AND seriesDate = '' AND valueDate = '';
In SQL Server, the UPDATEs are just as quick as the INSERTs, but in MySQL I am seeing them run 10x slower.
I am hoping I've overlooked some mutually compatible approach, or missed some JDBC connection configuration I need to adjust, maybe in conjunction with the number of items I'm putting in each batch.
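For reference, the kind of batching described above might look roughly like the following sketch, which sends the UPDATEs through a PreparedStatement and flushes them in fixed-size chunks. The class name, the Row holder, the chunking helper and the decision to turn off autocommit are all assumptions made for illustration and are not taken from the original post.

```java
import java.sql.Connection;
import java.sql.Date;
import java.sql.PreparedStatement;
import java.sql.SQLException;
import java.util.ArrayList;
import java.util.List;

final class BatchedUpdateSketch {

    /** Hypothetical holder for one changed row of the data table. */
    static final class Row {
        final long parentId;
        final Date seriesDate;
        final Date valueDate;
        final float value;

        Row(long parentId, Date seriesDate, Date valueDate, float value) {
            this.parentId = parentId;
            this.seriesDate = seriesDate;
            this.valueDate = valueDate;
            this.value = value;
        }
    }

    static final String UPDATE_SQL =
        "UPDATE data SET value = ? WHERE parentId = ? AND seriesDate = ? AND valueDate = ?";

    /** Splits rows into consecutive chunks of at most batchSize elements. */
    static <T> List<List<T>> chunks(List<T> rows, int batchSize) {
        if (batchSize <= 0) {
            throw new IllegalArgumentException("batchSize must be positive");
        }
        List<List<T>> out = new ArrayList<>();
        for (int from = 0; from < rows.size(); from += batchSize) {
            int to = Math.min(from + batchSize, rows.size());
            out.add(new ArrayList<>(rows.subList(from, to)));
        }
        return out;
    }

    /** Sends the updates as JDBC batches; returns the number of statements executed. */
    static int updateAll(Connection conn, List<Row> rows, int batchSize) throws SQLException {
        // Assumption: with MySQL Connector/J, rewriteBatchedStatements=true on the JDBC URL
        // changes how batches are sent; its effect on this schema is not verified here.
        int executed = 0;
        boolean oldAutoCommit = conn.getAutoCommit();
        conn.setAutoCommit(false); // assumption: one transaction rather than a commit per row
        try (PreparedStatement ps = conn.prepareStatement(UPDATE_SQL)) {
            for (List<Row> chunk : chunks(rows, batchSize)) {
                for (Row r : chunk) {
                    ps.setFloat(1, r.value);
                    ps.setLong(2, r.parentId);
                    ps.setDate(3, r.seriesDate);
                    ps.setDate(4, r.valueDate);
                    ps.addBatch();
                }
                executed += ps.executeBatch().length; // flush one chunk
            }
            conn.commit();
        } catch (SQLException e) {
            conn.rollback();
            throw e;
        } finally {
            conn.setAutoCommit(oldAutoCommit);
        }
        return executed;
    }
}
```

A call such as `BatchedUpdateSketch.updateAll(conn, rows, 500)` would send 1000 rows as two batches; the value 500 is arbitrary and would need measuring against both databases.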
[UPDATE 2018-05-17] Here's the requested DDL. Unfortunately I can't change this (yet), so any suggestions that involve schema changes won't help, at least not this year :(
CREATE TABLE data (
    parentId INT UNSIGNED NOT NULL,
    seriesDate DATE NOT NULL,
    valueDate DATE NOT NULL,
    value FLOAT NOT NULL,
    versionstamp INT UNSIGNED NOT NULL DEFAULT 1,
    createdDate DATETIME DEFAULT CURRENT_TIMESTAMP,
    last_modified DATETIME DEFAULT CURRENT_TIMESTAMP ON UPDATE CURRENT_TIMESTAMP,
    CONSTRAINT pk_data PRIMARY KEY (parentId, seriesDate, valueDate),
    CONSTRAINT fk_data_forecastid FOREIGN KEY (parentId)
        REFERENCES forecast (id)
) MAX_ROWS 222111000;

CREATE TRIGGER trg_data_update BEFORE UPDATE ON data
    FOR EACH ROW SET NEW.versionstamp = OLD.versionstamp + 1;

CREATE INDEX ix_data_seriesdate ON `data` (seriesDate);
The INSERT:
INSERT INTO `data` (`parentId`, `valueDate`, `value`, `seriesDate`)
VALUES (52031,'2010-04-20',1.12344,'2013-01-10')
EXPLAIN PLAN:
id: 1
select_type: INSERT
table: data
partitions:
type: ALL
possible_keys: PRIMARY,ix_data_seriesdate
And the UPDATE:
UPDATE `data` SET `value` = -2367.0
WHERE `parentId` = 52005 AND `seriesDate` = '2018-04-20' AND `valueDate` = '2000-02-11'
EXPLAIN PLAN:
id: 1
select_type: UPDATE
table: data
partitions:
type: range
possible_keys: PRIMARY,ix_data_seriesdate
key: PRIMARY
key_len: 10
ref: const,const,const
rows: 1
filtered: 100
Extra: Using where
And the DELETE:
DELETE FROM `data` WHERE `parentId` = 52030 AND `seriesDate` = '2018-04-20'
EXPLAIN PLAN:
id: 1
select_type: DELETE
table: data
partitions:
type: range
possible_keys: PRIMARY,ix_data_seriesdate
key: PRIMARY
key_len: 7
ref: const,const
rows: 1
filtered: 100
Extra: Using where
FYI, two fields are updated automatically: last_modified by the ON UPDATE clause and versionstamp by the trigger (and again, I can't ditch that functionality).
Recommended answer
Ways I've found to improve UPDATE statements:
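- Use a helper table (it can be updated in "batches")
- Check for unnecessary triggers
- Improve the indexes (for the WHERE clause)
- Use intermediary temporary tables, OLAP or OLTP (they allow the update to be done in one batch)
E.g.: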
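-- Staging table for the new values. The #TempData temp-table naming is SQL Server syntax;
-- INT UNSIGNED is a MySQL type (SQL Server would need INT or BIGINT instead).
CREATE TABLE #TempData (
    parentId INT UNSIGNED NOT NULL,
    seriesDate DATE NOT NULL,
    valueDate DATE NOT NULL,
    value FLOAT NOT NULL
);

-- Load every changed row with a single multi-row INSERT.
INSERT INTO #TempData ( parentId, seriesDate, valueDate, value )
VALUES ( .... ), ( .... ), ( .... );

-- Apply all changes in one set-based UPDATE (UPDATE ... FROM is not accepted by MySQL).
UPDATE
    data
SET
    value = #TempData.value
FROM
    #TempData
WHERE
    data.parentId = #TempData.parentId AND
    data.seriesDate = #TempData.seriesDate AND
    data.valueDate = #TempData.valueDate;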
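
The snippet above uses SQL Server's `#TempData` naming and `UPDATE ... FROM`. On MySQL the same idea would have to be written with `CREATE TEMPORARY TABLE` and a multi-table `UPDATE ... JOIN`, so it is dialect-specific rather than mutually compatible. Below is a minimal sketch of the MySQL flavour driven from JDBC. The class name, the `tmp_data` table name, the primary key on the staging table and the `Object[]` row layout are hypothetical choices, and nothing here has been timed against the schema in the question.

```java
import java.sql.Connection;
import java.sql.PreparedStatement;
import java.sql.SQLException;
import java.sql.Statement;
import java.util.List;

final class StagingTableUpdateSketch {

    // Hypothetical staging table; a TEMPORARY table is visible only to this connection.
    static final String CREATE_TMP =
        "CREATE TEMPORARY TABLE tmp_data ("
      + " parentId INT UNSIGNED NOT NULL,"
      + " seriesDate DATE NOT NULL,"
      + " valueDate DATE NOT NULL,"
      + " value FLOAT NOT NULL,"
      + " PRIMARY KEY (parentId, seriesDate, valueDate))";

    // MySQL multi-table UPDATE, standing in for SQL Server's UPDATE ... FROM.
    static final String UPDATE_JOIN =
        "UPDATE data d JOIN tmp_data t"
      + " ON d.parentId = t.parentId"
      + " AND d.seriesDate = t.seriesDate"
      + " AND d.valueDate = t.valueDate"
      + " SET d.value = t.value";

    static final String DROP_TMP = "DROP TEMPORARY TABLE IF EXISTS tmp_data";

    /** Builds a multi-row INSERT with four placeholders per row. */
    static String insertSql(int rowCount) {
        if (rowCount <= 0) {
            throw new IllegalArgumentException("rowCount must be positive");
        }
        StringBuilder sb = new StringBuilder(
            "INSERT INTO tmp_data (parentId, seriesDate, valueDate, value) VALUES ");
        for (int i = 0; i < rowCount; i++) {
            if (i > 0) {
                sb.append(", ");
            }
            sb.append("(?, ?, ?, ?)");
        }
        return sb.toString();
    }

    /**
     * Each Object[] holds {parentId, seriesDate, valueDate, value} in that order.
     * Returns the update count reported by the driver for the joined UPDATE.
     */
    static int updateViaStaging(Connection conn, List<Object[]> rows) throws SQLException {
        if (rows.isEmpty()) {
            return 0;
        }
        try (Statement st = conn.createStatement()) {
            st.executeUpdate(CREATE_TMP);
            try {
                // Load all new values with one multi-row INSERT.
                try (PreparedStatement ps = conn.prepareStatement(insertSql(rows.size()))) {
                    int p = 1;
                    for (Object[] row : rows) {
                        for (Object col : row) {
                            ps.setObject(p++, col);
                        }
                    }
                    ps.executeUpdate();
                }
                // Apply every change with a single set-based UPDATE.
                return st.executeUpdate(UPDATE_JOIN);
            } finally {
                st.executeUpdate(DROP_TMP); // clean up even if a step above failed
            }
        }
    }
}
```

For very large row sets the single INSERT would probably need splitting into several smaller ones, since statement size and placeholder counts are limited by the server and driver.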