我有以下表格:

Studya(Alias J)

LOGINID   RCD_NUM         DATE         TYPE
 8745       0         04/15/2018      PRELIM
 6548       0         08/19/2017      PRELIM
 7445       0         10/02/2017      PRELIM
 2867       0         03/19/2018      PRELIM


试用(Alias G)

LOGINID   RCD_NUM         DATE         TYPE
 8745       0          02/15/2017     ROLLUP
 7445       0          07/09/2016     ROLLUP
 2867       0          05/17/2017     ROLLUP
 2867       0          05/28/2017     ROLLUP
 5249       0          06/20/2017     ROLLUP
 1335       0          09/29/2017     ROLLUP
 9238       0          12/03/2017     ROLLUP


SPRINT(Alias H)

 LOGINID   RCD_NUM         DATE        TYPE
  5521       0          01/10/2018     SPRNT
  8745       1          04/04/2018     SPRNT
  3487       0          11/14/2017     SPRNT
  6627       0          05/05/2018     SPRNT


另一个与LOGINID匹配的名为LOGIN的表。

我在SQL Server 2014上运行以下查询:

SELECT 'COUNTS', COUNT(G.LOGINID), COUNT(H.LOGINID), COUNT(J.LOGINID)
 FROM LOGIN F
LEFT OUTER JOIN  TRIALS G ON  F.LOGINID= G.LOGINID AND G.RCD_NUM =
 F.RCD_NUM
LEFT OUTER JOIN  SPRINT H ON  F.LOGINID= H.LOGINID AND H.RCD_NUM =
 F.RCD_NUM
LEFT OUTER JOIN  STUDYA J ON  F.LOGINID= J.LOGINID AND J.RCD_NUM =
 F.RCD_NUM

WHERE ( ( F.EFFDT =
(SELECT MAX(F_ED.EFFDT) FROM PS_JOB F_ED
WHERE F.LOGINID = F_ED.LOGINID
  AND F.RCD_NUM = F_ED.RCD_NUM
  AND F_ED.EFFDT <= SUBSTRING(CONVERT(CHAR,GETDATE(),121), 1, 10))
AND F.EFFSEQ =
(SELECT MAX(F_ES.EFFSEQ) FROM PS_JOB F_ES
 WHERE F.LOGINID = F_ES.LOGINID
  AND F.RCD_NUM = F_ES.RCD_NUM
  AND F.EFFDT = F_ES.EFFDT) )


这是上述查询的结果:

(No column name)   (No column name)   (No column name)   (No column name)
  COUNTS                  9                  5                  5


您可以看到最后一列(COUNT(J.LOGINID))显示5条记录,而在实际表中(上面的查询)实际上只有4条记录。我认为这是因为LOGINID 2867在TRIALS表中有2行,在RESEARCHA中有1行。

我对此进行了研究,建议似乎是在Joins之前进行汇总。我正在努力做到最好的方法。任何帮助表示赞赏!

最佳答案

根据您的数据和用例,您可能会从所引用的方法中获得性能收益:先聚合再加入...

SELECT
 'COUNTS',
 COUNT(G.LOGINID) distinct_g, SUM(G.ROW_COUNT) row_count_g,
 COUNT(H.LOGINID) distinct_h, SUM(H.ROW_COUNT) row_count_h,
 COUNT(J.LOGINID) distinct_j, SUM(J.ROW_COUNT) row_count_j
FROM
 LOGIN F
LEFT OUTER JOIN
(
 SELECT LOGINID, RCD_NUM, COUNT(*) AS ROW_COUNT FROM TRIALS GROUP BY LOGINID, RCD_NUM
)
 G
  ON  F.LOGINID = G.LOGINID
  AND F.RCD_NUM = G.RCD_NUM
LEFT OUTER JOIN
(
 SELECT LOGINID, RCD_NUM, COUNT(*) AS ROW_COUNT FROM SPRINT GROUP BY LOGINID, RCD_NUM
)
 H
  ON  F.LOGINID = H.LOGINID
  AND F.RCD_NUM = H.RCD_NUM
LEFT OUTER JOIN
(
 SELECT LOGINID, RCD_NUM, COUNT(*) AS ROW_COUNT FROM STUDYA GROUP BY LOGINID, RCD_NUM
)
 J
  ON  F.LOGINID = J.LOGINID
  AND F.RCD_NUM = J.RCD_NUM

WHERE ( ( F.EFFDT =
(SELECT MAX(F_ED.EFFDT) FROM PS_JOB F_ED
WHERE F.LOGINID = F_ED.LOGINID
  AND F.RCD_NUM = F_ED.RCD_NUM
  AND F_ED.EFFDT <= SUBSTRING(CONVERT(CHAR,GETDATE(),121), 1, 10))
AND F.EFFSEQ =
(SELECT MAX(F_ES.EFFSEQ) FROM PS_JOB F_ES
 WHERE F.LOGINID = F_ES.LOGINID
  AND F.RCD_NUM = F_ES.RCD_NUM
  AND F.EFFDT = F_ES.EFFDT) )


优化器可以使用谓词下推/类似宏的扩展来确保仅对相关行进行聚合并保留索引的使用。

此模式还允许您获取其他统计信息,例如每种登录名/ rcd组合的最小/最大日期。

编辑:

另一种模式可能包括...

SELECT
 'COUNTS',
 SUM(G.ROW_COUNT) row_count_g,
 SUM(H.ROW_COUNT) row_count_h,
 SUM(J.ROW_COUNT) row_count_j
FROM
 LOGIN F
OUTER APPLY
(
 SELECT COUNT(*) AS ROW_COUNT FROM TRIALS WHERE LOGINID = F.LOGINID AND RCD_NUM = F.RCD_NUM
)
  G
OUTER APPLY
(
 SELECT COUNT(*) AS ROW_COUNT FROM SPRINT WHERE LOGINID = F.LOGINID AND RCD_NUM = F.RCD_NUM
)
  H
OUTER APPLY
(
 SELECT COUNT(*) AS ROW_COUNT FROM STUDYA WHERE LOGINID = F.LOGINID AND RCD_NUM = F.RCD_NUM
)
  J

WHERE ( ( F.EFFDT =
(SELECT MAX(F_ED.EFFDT) FROM PS_JOB F_ED
WHERE F.LOGINID = F_ED.LOGINID
  AND F.RCD_NUM = F_ED.RCD_NUM
  AND F_ED.EFFDT <= SUBSTRING(CONVERT(CHAR,GETDATE(),121), 1, 10))
AND F.EFFSEQ =
(SELECT MAX(F_ES.EFFSEQ) FROM PS_JOB F_ES
 WHERE F.LOGINID = F_ES.LOGINID
  AND F.RCD_NUM = F_ES.RCD_NUM
  AND F.EFFDT = F_ES.EFFDT) )

10-04 10:43