我试图获取一个表,其中字符串被排列在由三个字符串组成的集群中的记录中,并返回一个查询,其中数据位于一个记录中。
表定义为:

CREATE TABLE Xerox.FIM2 (
  KPIN CHAR(18) NOT NULL COMMENT 'Part Number',
  KSEQ DECIMAL(2, 0) NOT NULL COMMENT 'Sequence Number',
  SDS1 CHAR(40) NOT NULL COMMENT 'Supplemental Description 1',
  SDS2 CHAR(30) NOT NULL COMMENT 'Supplemental Description 2',
  SDS3 CHAR(30) NOT NULL COMMENT 'Supplemental Description 3',
  PRIMARY KEY (KPIN, KSEQ),
  INDEX IDX_FIM2_KSEQ (KSEQ),
  INDEX IDX_FIM2_SDS1 (SDS1),
  INDEX IDX_FIM2_SDS2 (SDS2),
  INDEX IDX_FIM2_SDS3 (SDS3),
  INDEX UK_FIM2_KPIN (KPIN)
)
ENGINE = INNODB
AVG_ROW_LENGTH = 180
CHARACTER SET latin1
COLLATE latin1_swedish_ci
COMMENT = 'Supplemental Part Descriptions';

给定的零件号可能没有记录或最多有8个记录。这八条记录包含字符串数据(描述,每条记录三条),我正试图从中转换:
KPIN  KSEQ  SDS1                                SDS2                            SDS3

R0205   1   COLD ROLLED A1008/A CS TYPE B       MATTE FINISH OIL / EXPOSED      MIN YIELD 30,000 RB 45-60
R0205   2   THICKNESS .032 TOL +.006/-.000      WIDTH 48.000 TOL +.188/-.000    LENGTH 84.875 TOL +/-.020
R0205   3   SQUARENESS TOL +/-.062              MAX LIFT WEIGHT 10,000 LBS      <null>

数据如下:
KPIN  KSEQ  DES1                                DES2                            DES3                        DES4                                DES5                            DES6                        DES7                                DES8                            DES9                        DES10                               DES11                           DES12

R0205   1   COLD ROLLED A1008/A CS TYPE B       MATTE FINISH OIL / EXPOSED      MIN YIELD 30,000 RB 45-60   THICKNESS .032 TOL +.006/-.000      WIDTH 48.000 TOL +.188/-.000    LENGTH 84.875 TOL +/-.020   THICKNESS .032 TOL +.006/-.000      WIDTH 48.000 TOL +.188/-.000    LENGTH 84.875 TOL +/-.020   SQUARENESS TOL +/-.062              MAX LIFT WEIGHT 10,000 LBS      <null><null><null><null>...

执行此操作的sql命令是:
    SELECT FIM1.KPIN, FIM1.DES, p.DES1, p.DES2, p.DES3, p.DES4, p.DES5, p.DES6,
                                p.DES7, p.DES8, p.DES9, p.DES10, p.DES11, p.DES12,
                                p.DES13, p.DES14, p.DES15, p.DES16, p.DES17, p.DES18,
                                p.DES19, p.DES20, p.DES21, p.DES22, p.DES23, p.DES24
    FROM FIM1
    JOIN
    (SELECT  KPIN, KSEQ,
      MAX(CASE WHEN (KSEQ = 1) THEN SDS1 END) AS DES1,
      MAX(CASE WHEN (KSEQ = 1) THEN SDS2 END) AS DES2,
      MAX(CASE WHEN (KSEQ = 1) THEN SDS3 END) AS DES3,
      MAX(CASE WHEN (KSEQ = 2) THEN SDS1 END) AS DES4,
      MAX(CASE WHEN (KSEQ = 2) THEN SDS2 END) AS DES5,
      MAX(CASE WHEN (KSEQ = 2) THEN SDS3 END) AS DES6,
      MAX(CASE WHEN (KSEQ = 3) THEN SDS1 END) AS DES7,
      MAX(CASE WHEN (KSEQ = 3) THEN SDS2 END) AS DES8,
      MAX(CASE WHEN (KSEQ = 3) THEN SDS3 END) AS DES9,
      MAX(CASE WHEN (KSEQ = 4) THEN SDS1 END) AS DES10,
      MAX(CASE WHEN (KSEQ = 4) THEN SDS2 END) AS DES11,
      MAX(CASE WHEN (KSEQ = 4) THEN SDS3 END) AS DES12,
      MAX(CASE WHEN (KSEQ = 5) THEN SDS1 END) AS DES13,
      MAX(CASE WHEN (KSEQ = 5) THEN SDS2 END) AS DES14,
      MAX(CASE WHEN (KSEQ = 5) THEN SDS3 END) AS DES15,
      MAX(CASE WHEN (KSEQ = 6) THEN SDS1 END) AS DES16,
      MAX(CASE WHEN (KSEQ = 6) THEN SDS2 END) AS DES17,
      MAX(CASE WHEN (KSEQ = 6) THEN SDS3 END) AS DES18,
      MAX(CASE WHEN (KSEQ = 7) THEN SDS1 END) AS DES19,
      MAX(CASE WHEN (KSEQ = 7) THEN SDS2 END) AS DES20,
      MAX(CASE WHEN (KSEQ = 7) THEN SDS3 END) AS DES21,
      MAX(CASE WHEN (KSEQ = 8) THEN SDS1 END) AS DES22,
      MAX(CASE WHEN (KSEQ = 8) THEN SDS2 END) AS DES23,
      MAX(CASE WHEN (KSEQ = 8) THEN SDS3 END) AS DES24

    FROM FIM2
    GROUP BY KPIN
    ORDER BY KPIN, KSEQ) as p
    ON FIM1.KPIN=p.KPIN ORDER BY FIM1.KPIN

表FIM1中有118552条记录(主要零件号列表)
表FIM2中有66303条记录(所有这些说明)
表FIM1中的62163条记录在FIM2中至少有一条描述记录。
上面写的查询在大约3秒钟内返回62163条记录。如果我将其改为左连接而不是连接,我将得到所有118552条包含大量空值的记录。手术需要几分钟几分钟才能完成。我想知道为什么这两种手术有这么大的区别。似乎左连接迫使引擎一次又一次地遍历fim2表,但是为什么对于简单连接不必这样做呢?我意识到我正在为每个项创建类似临时记录的东西,但是简单的连接仍然能够优化它。
该行合并技术基于以下详细概念:
How to pivot? How to convert multiple rows into one row with multiple columns?
我的查询计划:
(对于连接):
id  select_type table       type        possible_keys           key         key_len ref     rows    Extra
1   PRIMARY     <derived2>  ALL         (null)                  (null)      (null)  (null)  64512   Using temporary; Using filesort
1   PRIMARY     FIM1        eq_ref      PRIMARY,UK_FIM1_KPIN    PRIMARY     18      p.KPIN  1
2   DERIVED     FIM2        index       (null)                  PRIMARY     19      (null)  64085   Using temporary; Using filesort

对于左连接:
id  select_type table       type        possible_keys   key             key_len     ref     rows    Extra
1   PRIMARY     FIM1        index       (null)          IDX_FIM1_DES    30          (null)  124199  Using index; Using temporary; Using filesort
1   PRIMARY     <derived2>  ALL         (null)          (null)          (null)      (null)  64512
2   DERIVED     FIM2        index       (null)          PRIMARY         19          (null)  64085   Using temporary; Using filesort

最佳答案

啊,非标准化数据集的乐趣。这不是你问的问题,但是你能像这样重做这个FIM2表吗,所以每行只有一个项目,而不是三个?如果你这样做,你的生活会变得更容易。

KPIN  KSEQ  KFLD  ATTRIBUTE
R0205   1      1   COLD ROLLED A1008/A CS TYPE B
R0205   1      2   MATTE FINISH OIL / EXPOSED
R0205   1      3   MIN YIELD 30,000 RB 45-60
R0205   2      1   THICKNESS .032 TOL +.006/-.000
R0205   2      2   WIDTH 48.000 TOL +.188/-.000
R0205   2      3   LENGTH 84.875 TOL +/-.020
R0205   3      1   SQUARENESS TOL +/-.062
R0205   3      2   MAX LIFT WEIGHT 10,000 LBS

但是,这不是你要的。您希望改进报表程序,以便它能够在合理的时间内LEFT JOIN这两个表。
分解查询很重要。首先,尝试LEFT JOIN表和FIM1表,然后将结果集旋转以将内容扩展到一行。
从功能上讲,你的方法是先旋转,然后连接。让我们尝试另一种方法,先连接,然后旋转。这将减少满足查询所需的数字处理。
     SELECT ff.KPIN, ff.DES,
            a.SDS1 AS DES1,
            a.SDS2 AS DES2,
            a.SDS3 AS DES3,
            b.SDS1 AS DES4,
            b.SDS2 AS DES5,
            b.SDS3 AS DES6,
            c.SDS1 AS DES7,
            c.SDS2 AS DES8, ...
            h.SDS1 AS DES23,
            h.SDS2 AS DES24
       FROM FIM1 AS ff
  LEFT JOIN FIM2 AS a ON ff.KPIN = a.KPIN AND a.KSEQ=1
  LEFT JOIN FIM2 AS b ON ff.KPIN = b.KPIN AND b.KSEQ=2
  LEFT JOIN FIM2 AS c ON ff.KPIN = c.KPIN AND c.KSEQ=3
  LEFT JOIN FIM2 AS d ON ff.KPIN = d.KPIN AND d.KSEQ=4
  LEFT JOIN FIM2 AS e ON ff.KPIN = e.KPIN AND e.KSEQ=5
  LEFT JOIN FIM2 AS f ON ff.KPIN = f.KPIN AND f.KSEQ=6
  LEFT JOIN FIM2 AS g ON ff.KPIN = g.KPIN AND g.KSEQ=7
  LEFT JOIN FIM2 AS h ON ff.KPIN = h.KPIN AND h.KSEQ=8

如果您使用复合索引FIM2索引FIM2,这可能会非常快。不可否认,使用8个(KSEQ, KPIN)操作时,它看起来有些毛茸茸的,但是您的LEFT JOIN查询也是毛茸茸的。您还应该尝试索引MAX() ... GROUP BY以查看哪个更快。

10-06 10:52