我的SQL中有两个表

例如表1-ItemPrice:

DATETIME   |  ITEM | PRICE
2011-08-28 |   ABC   123
2011-09-01 |   ABC   125
2011-09-02 |   ABC   124
2011-09-03 |   ABC   127
2011-09-04 |   ABC   126


表2-DayScore:

DATETIME   |  ITEM | SCORE
2011-08-28 |   ABC    1
2011-08-29 |   ABC    8
2011-09-01 |   ABC    4
2011-09-02 |   ABC    2
2011-09-03 |   ABC    7
2011-09-04 |   ABC    3


我想写一个查询,给定一个项目ID(例如ABC),它将从ItemPrice返回该日期的价格(该日期没有价格,则查询不应返回任何内容)。如果找到查询日期的有效价格,则查询应返回(9列)


过去三天内来自ItemPrice的商品价格(即查询日期之前的最新3个价格)。
在接下来的三列中,它应该从DayScore返回从ItemPrice中选择的这3个日期的匹配分数。
最后选择日期(t-1至t-3)


换句话说,此查询的结果仅以date ='2011-09-03'为例(例如item ='abc')将返回:

DATE      |  ITEM  |  PRICE  |  SCR  | PRC_t-1 | PRC_t-2 | PRC_t-3 | SCR_t-1 | SCR_t-2 | SCR_t-3 | DATE_t-1  | DATE_t-2  | DATE_t-3
2011-09-03|  ABC   |  127    |  7    | 124     | 125     | 123     | 2       | 4       | 1       | 2011-09-02| 2011-09-01| 2011-08-28
....


等等ItemPrice表中出现的每个日期。

什么是最新颖,最有效的方法来运行此查询(因为它将在数百万行上运行)?

干杯!

最佳答案

完全没有,但是确实可以产生结果。您可能会摆脱一些子选择,使它的sql少一些,但我尝试逐步进行构建,以便可以推断出它在做什么。

核心部分是此选择:

SELECT
  Sub2.*
, (Select MAX(IP3.DateTime) FROM ItemPrice IP3 where IP3.DateTime < T_2) AS T_3
FROM
   (SELECT
        Sub1.*
      , (Select MAX(IP2.DateTime) FROM ItemPrice IP2 where IP2.DateTime < T_1) AS T_2
    FROM
       (SELECT
            ItemPrice.DateTime
          , (Select MAX(IP.DateTime) FROM ItemPrice IP where IP.DateTime < ItemPrice.DateTime) AS T_1
        From ItemPrice) Sub1
   ) Sub2


这将返回一个带有日期的表(现在是t-1,t-2,t-3)。从那里可以简单地将每个日期的价格和分数连接起来。包括testdata在内的所有内容都变成了这大部分sql

/*
CREATE TABLE ItemPrice (datetime Date, item varchar(3), price int);
CREATE TABLE DayScore ( datetime Date, item varchar(3), score int);

INSERT INTO ItemPrice VALUES ('20110828', 'ABC', 123);
INSERT INTO ItemPrice VALUES ('20110901', 'ABC', 125);
INSERT INTO ItemPrice VALUES ('20110902', 'ABC', 124);
INSERT INTO ItemPrice VALUES ('20110903', 'ABC', 127);
INSERT INTO ItemPrice VALUES ('20110904', 'ABC', 126);

INSERT INTO DayScore VALUES ('20110828', 'ABC', 1);
INSERT INTO DayScore VALUES ('20110829', 'ABC', 8);
INSERT INTO DayScore VALUES ('20110901', 'ABC', 4);
INSERT INTO DayScore VALUES ('20110902', 'ABC', 2);
INSERT INTO DayScore VALUES ('20110903', 'ABC', 7);
INSERT INTO DayScore VALUES ('20110904', 'ABC', 3);
*/

SELECT Hist.*, Current.Item, Current.Price, Current.Score
, Minus1.Price as PRC_1, Minus1.Score SCR_1
, Minus2.Price as PRC_2, Minus2.Score SCR_2
, Minus3.Price as PRC_3, Minus3.Score SCR_3
FROM
    (SELECT Sub2.*, (Select MAX(IP3.DateTime) FROM ItemPrice IP3 where IP3.DateTime < T_2) AS T_3
    FROM
        (SELECT Sub1.*, (Select MAX(IP2.DateTime) FROM ItemPrice IP2 where IP2.DateTime < T_1) AS T_2
        FROM
            (SELECT ItemPrice.DateTime, (Select MAX(IP.DateTime) FROM ItemPrice IP where IP.DateTime < ItemPrice.DateTime) AS T_1 From ItemPrice) Sub1) Sub2) Hist
INNER JOIN
    (SELECT ItemPrice.DateTime, ItemPrice.Item, ItemPrice.Price, DayScore.Score FROM ItemPrice INNER JOIN DayScore ON (ItemPrice.Item = DayScore.Item AND ItemPrice.Datetime = DayScore.DateTime)) CURRENT
ON (Current.DateTime = Hist.DateTime)
LEFT JOIN
    (SELECT ItemPrice.DateTime, ItemPrice.Price, DayScore.Score FROM ItemPrice INNER JOIN DayScore ON (ItemPrice.Item = DayScore.Item AND ItemPrice.Datetime = DayScore.DateTime)) MINUS1
ON (Minus1.DateTime = Hist.T_1)
LEFT JOIN
    (SELECT ItemPrice.DateTime, ItemPrice.Price, DayScore.Score FROM ItemPrice INNER JOIN DayScore ON (ItemPrice.Item = DayScore.Item AND ItemPrice.Datetime = DayScore.DateTime)) MINUS2
ON (Minus2.DateTime = Hist.T_2)
LEFT JOIN
    (SELECT ItemPrice.DateTime, ItemPrice.Price, DayScore.Score FROM ItemPrice INNER JOIN DayScore ON (ItemPrice.Item = DayScore.Item AND ItemPrice.Datetime = DayScore.DateTime)) MINUS3
ON (Minus3.DateTime = Hist.T_3)
WHERE Current.Item = 'ABC'

;

/*
DROP TABLE ItemPrice;
DROP TABLE DayScore;
*/


当您对1M行执行此操作时,我对您的解释计划感到好奇:)如果您具有正确的索引(可能会执行),甚至可能还不那么可怕。

09-26 15:56