我的SQL中有两个表
例如表1-ItemPrice:
DATETIME | ITEM | PRICE
2011-08-28 | ABC 123
2011-09-01 | ABC 125
2011-09-02 | ABC 124
2011-09-03 | ABC 127
2011-09-04 | ABC 126
表2-DayScore:
DATETIME | ITEM | SCORE
2011-08-28 | ABC 1
2011-08-29 | ABC 8
2011-09-01 | ABC 4
2011-09-02 | ABC 2
2011-09-03 | ABC 7
2011-09-04 | ABC 3
我想写一个查询,给定一个项目ID(例如ABC),它将从
ItemPrice
返回该日期的价格(该日期没有价格,则查询不应返回任何内容)。如果找到查询日期的有效价格,则查询应返回(9列)过去三天内来自
ItemPrice
的商品价格(即查询日期之前的最新3个价格)。在接下来的三列中,它应该从
DayScore
返回从ItemPrice中选择的这3个日期的匹配分数。最后选择日期(t-1至t-3)
换句话说,此查询的结果仅以date ='2011-09-03'为例(例如item ='abc')将返回:
DATE | ITEM | PRICE | SCR | PRC_t-1 | PRC_t-2 | PRC_t-3 | SCR_t-1 | SCR_t-2 | SCR_t-3 | DATE_t-1 | DATE_t-2 | DATE_t-3
2011-09-03| ABC | 127 | 7 | 124 | 125 | 123 | 2 | 4 | 1 | 2011-09-02| 2011-09-01| 2011-08-28
....
等等
ItemPrice
表中出现的每个日期。什么是最新颖,最有效的方法来运行此查询(因为它将在数百万行上运行)?
干杯!
最佳答案
完全没有,但是确实可以产生结果。您可能会摆脱一些子选择,使它的sql少一些,但我尝试逐步进行构建,以便可以推断出它在做什么。
核心部分是此选择:
SELECT
Sub2.*
, (Select MAX(IP3.DateTime) FROM ItemPrice IP3 where IP3.DateTime < T_2) AS T_3
FROM
(SELECT
Sub1.*
, (Select MAX(IP2.DateTime) FROM ItemPrice IP2 where IP2.DateTime < T_1) AS T_2
FROM
(SELECT
ItemPrice.DateTime
, (Select MAX(IP.DateTime) FROM ItemPrice IP where IP.DateTime < ItemPrice.DateTime) AS T_1
From ItemPrice) Sub1
) Sub2
这将返回一个带有日期的表(现在是t-1,t-2,t-3)。从那里可以简单地将每个日期的价格和分数连接起来。包括testdata在内的所有内容都变成了这大部分sql
/*
CREATE TABLE ItemPrice (datetime Date, item varchar(3), price int);
CREATE TABLE DayScore ( datetime Date, item varchar(3), score int);
INSERT INTO ItemPrice VALUES ('20110828', 'ABC', 123);
INSERT INTO ItemPrice VALUES ('20110901', 'ABC', 125);
INSERT INTO ItemPrice VALUES ('20110902', 'ABC', 124);
INSERT INTO ItemPrice VALUES ('20110903', 'ABC', 127);
INSERT INTO ItemPrice VALUES ('20110904', 'ABC', 126);
INSERT INTO DayScore VALUES ('20110828', 'ABC', 1);
INSERT INTO DayScore VALUES ('20110829', 'ABC', 8);
INSERT INTO DayScore VALUES ('20110901', 'ABC', 4);
INSERT INTO DayScore VALUES ('20110902', 'ABC', 2);
INSERT INTO DayScore VALUES ('20110903', 'ABC', 7);
INSERT INTO DayScore VALUES ('20110904', 'ABC', 3);
*/
SELECT Hist.*, Current.Item, Current.Price, Current.Score
, Minus1.Price as PRC_1, Minus1.Score SCR_1
, Minus2.Price as PRC_2, Minus2.Score SCR_2
, Minus3.Price as PRC_3, Minus3.Score SCR_3
FROM
(SELECT Sub2.*, (Select MAX(IP3.DateTime) FROM ItemPrice IP3 where IP3.DateTime < T_2) AS T_3
FROM
(SELECT Sub1.*, (Select MAX(IP2.DateTime) FROM ItemPrice IP2 where IP2.DateTime < T_1) AS T_2
FROM
(SELECT ItemPrice.DateTime, (Select MAX(IP.DateTime) FROM ItemPrice IP where IP.DateTime < ItemPrice.DateTime) AS T_1 From ItemPrice) Sub1) Sub2) Hist
INNER JOIN
(SELECT ItemPrice.DateTime, ItemPrice.Item, ItemPrice.Price, DayScore.Score FROM ItemPrice INNER JOIN DayScore ON (ItemPrice.Item = DayScore.Item AND ItemPrice.Datetime = DayScore.DateTime)) CURRENT
ON (Current.DateTime = Hist.DateTime)
LEFT JOIN
(SELECT ItemPrice.DateTime, ItemPrice.Price, DayScore.Score FROM ItemPrice INNER JOIN DayScore ON (ItemPrice.Item = DayScore.Item AND ItemPrice.Datetime = DayScore.DateTime)) MINUS1
ON (Minus1.DateTime = Hist.T_1)
LEFT JOIN
(SELECT ItemPrice.DateTime, ItemPrice.Price, DayScore.Score FROM ItemPrice INNER JOIN DayScore ON (ItemPrice.Item = DayScore.Item AND ItemPrice.Datetime = DayScore.DateTime)) MINUS2
ON (Minus2.DateTime = Hist.T_2)
LEFT JOIN
(SELECT ItemPrice.DateTime, ItemPrice.Price, DayScore.Score FROM ItemPrice INNER JOIN DayScore ON (ItemPrice.Item = DayScore.Item AND ItemPrice.Datetime = DayScore.DateTime)) MINUS3
ON (Minus3.DateTime = Hist.T_3)
WHERE Current.Item = 'ABC'
;
/*
DROP TABLE ItemPrice;
DROP TABLE DayScore;
*/
当您对1M行执行此操作时,我对您的解释计划感到好奇:)如果您具有正确的索引(可能会执行),甚至可能还不那么可怕。