mysql - MySQL slow duration or fetch time depending on the "distinct" command

Select distinct AUDSpot1yFq.refDate,AUDSpot1yFq.rate as 'AUDSpot1yFq',
AUD1y1yFq.rate as AUD1y1yFq
from audratedb AUDSpot1yFq inner join audratedb AUD1y1yFq on
AUDSpot1yFq.refDate=AUD1y1yFq.refDate
where AUDSpot1yFq.instrument = 'AUDSpot1yFq' and
AUD1y1yFq.instrument = 'AUD1y1yFq'
order by AUDSpot1yFq.refDate

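Note that in the specific query timed below I am actually pulling 10 different instruments, so the query is longer, but it follows the same naming convention, inner joins and where clauses.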

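This is slow: in Workbench I time it at a duration of 7-8 seconds (the fetch time is close to 0, since Workbench runs on the same machine as the server). When I strip the distinct, the duration drops to 0.25-0.5 seconds (much more manageable), and when I strip the "order by" it gets even faster (
When I run EXPLAIN on the reduced query (the one with the horrible fetch time), I get: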

id  select_type  table        type  possible_keys                 key         key_len  ref                                          rows  filtered  Extra
1   SIMPLE       AUDSpot1yFq  ref   unique_ID,refDate,instrument  instrument  39       const                                        1432  100.00    Using where
1   SIMPLE       AUD1y1yFq    ref   unique_ID,refDate,instrument  unique_ID   42       historicalratesdb.AUDSpot1yFq.refDate,const  1     100.00    Using where
1   SIMPLE       AUD2y1yFq    ref   unique_ID,refDate,instrument  unique_ID   42       historicalratesdb.AUDSpot1yFq.refDate,const  1     100.00    Using where
1   SIMPLE       AUD3y1yFq    ref   unique_ID,refDate,instrument  unique_ID   42       historicalratesdb.AUDSpot1yFq.refDate,const  1     100.00    Using where
1   SIMPLE       AUD4y1yFq    ref   unique_ID,refDate,instrument  unique_ID   42       historicalratesdb.AUDSpot1yFq.refDate,const  1     100.00    Using where
1   SIMPLE       AUD5y1yFq    ref   unique_ID,refDate,instrument  unique_ID   42       historicalratesdb.AUDSpot1yFq.refDate,const  1     100.00    Using where
1   SIMPLE       AUD6y1yFq    ref   unique_ID,refDate,instrument  unique_ID   42       historicalratesdb.AUDSpot1yFq.refDate,const  1     100.00    Using where
1   SIMPLE       AUD7y1yFq    ref   unique_ID,refDate,instrument  unique_ID   42       historicalratesdb.AUDSpot1yFq.refDate,const  1     100.00    Using where
1   SIMPLE       AUD8y1yFq    ref   unique_ID,refDate,instrument  unique_ID   42       historicalratesdb.AUDSpot1yFq.refDate,const  1     100.00    Using where
1   SIMPLE       AUD9y1yFq    ref   unique_ID,refDate,instrument  unique_ID   42       historicalratesdb.AUDSpot1yFq.refDate,const  1     100.00    Using where

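Now, I realize I don't need the distinct, and since I am loading the output into a dataframe I can drop the order by and sort in pandas instead. That's great. But I don't know how to bring the fetch time down. I'm not going to win any competency contests on this site, but I have searched as best I can and could not find a solution to this problem. Any help is greatly appreciated.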

~coco

Best answer

(In order to read it, I had to simplify the table aliases:)

Select  distinct
           s.refDate,
           s.rate as AUDSpot1yFq,
           y.rate as AUD1y1yFq
    from  audratedb AS s
    join  audratedb AS y  on s.refDate = y.refDate
    where  s.instrument = 'AUDSpot1yFq'
      and  y.instrument = 'AUD1y1yFq'
    order by  s.refDate

Index needed:

INDEX(instrument, refDate)  -- To filter and sort, or
INDEX(instrument, refDate, rate)  -- to also "cover" the query.

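That assumes the query is no more complex than you say. I see that the EXPLAIN already has many more tables. Please provide SHOW CREATE TABLE audratedb and the entire SELECT.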
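
If the query really is that simple, the covering index suggested above could be added roughly like this. The index name is made up for illustration; only the column list and its order come from the suggestion.

```sql
-- Hypothetical index name; what matters is the column order:
-- instrument first (equality filter), then refDate (join/sort), then rate (covering).
ALTER TABLE audratedb
    ADD INDEX idx_instrument_refDate_rate (instrument, refDate, rate);
```

With `rate` included, the query can be answered from the index alone, without touching the table rows.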

Back to your questions...

DISTINCT is done in one of two ways: (1) sort the rows, then dedup, or (2) dedup in an in-memory hash. Keep in mind that you are dedupping on all 3 columns (refDate, s.rate and y.rate).

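ORDER BY is a sort done after all the data has been gathered. However, with the suggested index (not the indexes you have), the sort is not needed, because the index delivers the rows in the desired order.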
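
One way to check whether the sort is really avoided is to run EXPLAIN on the query once the index exists and read the Extra column. This is only a sketch of the check; what the optimizer actually chooses depends on the real schema and data.

```sql
-- "Using filesort" in Extra would suggest that ORDER BY still needs a separate sort;
-- "Using temporary" would suggest that a temp table is being built (e.g. for DISTINCT).
EXPLAIN
SELECT  DISTINCT
           s.refDate,
           s.rate AS AUDSpot1yFq,
           y.rate AS AUD1y1yFq
    FROM  audratedb AS s
    JOIN  audratedb AS y  ON s.refDate = y.refDate
    WHERE  s.instrument = 'AUDSpot1yFq'
      AND  y.instrument = 'AUD1y1yFq'
    ORDER BY  s.refDate;
```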

But... having both DISTINCT and ORDER BY may confuse the Optimizer to the point of doing something "dumb".
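
One way to take that interaction out of the picture is to drop DISTINCT. This is a sketch under the assumption, raised in the next paragraph, that (refDate, instrument) is UNIQUE: each refDate then matches at most one row per instrument, so the join cannot produce duplicate rows.

```sql
-- Assumes (refDate, instrument) is UNIQUE, so DISTINCT would remove nothing.
SELECT  s.refDate,
        s.rate AS AUDSpot1yFq,
        y.rate AS AUD1y1yFq
    FROM  audratedb AS s
    JOIN  audratedb AS y  ON s.refDate = y.refDate
    WHERE  s.instrument = 'AUDSpot1yFq'
      AND  y.instrument = 'AUD1y1yFq'
    ORDER BY  s.refDate;
```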

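You say that (refDate,instrument) is UNIQUE, but you mention neither a PRIMARY KEY nor which Engine you are using. If you are using InnoDB, then PRIMARY KEY(instrument, refDate), in that order, would speed things up further and avoid the need for any new index.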

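Furthermore, it is redundant to have both (a,b) and (a). That is, your current schema does not need INDEX(refDate), but after changing the PK it is INDEX(instrument) that you would not need instead.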

Bottom line: Only

PRIMARY KEY(instrument, refDate),
INDEX(refDate)

and no other indexes (unless you can show some query that needs one).
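
Getting from the current indexes to that layout might look like the following. It is only a sketch: the SHOW CREATE TABLE output has not been posted, so the index names are taken from the possible_keys column of the EXPLAIN, and everything listed in the comments is an assumption.

```sql
-- Assumptions (not confirmed by the question):
--   * the table is InnoDB and has no PRIMARY KEY yet
--   * instrument and refDate are declared NOT NULL
--   * unique_ID is the UNIQUE index on (refDate, instrument)
--   * instrument and refDate are single-column indexes with those names
ALTER TABLE audratedb
    ADD PRIMARY KEY (instrument, refDate),  -- rows are clustered by instrument, then date
    DROP INDEX unique_ID,                   -- uniqueness is now enforced by the PK
    DROP INDEX instrument;                  -- redundant: it is a prefix of the PK
-- The existing INDEX(refDate) is left in place.
```

On a large table this rebuilds the table, so it is worth trying on a copy first.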

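More on the EXPLAIN. Notice how the Rows column reads 1432, 1, 1, ... That means it scanned an estimated 1432 rows of the first table. This is probably far more than necessary, because a proper index is missing. Then it needed to look at only 1 row in each of the other tables. (Can't get better than that.)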

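How many rows does the SELECT return without the DISTINCT or ORDER BY? That tells you how much work is left after the fetching and JOINing. I suspect it is only a few. A "few" is really cheap for DISTINCT and ORDER BY; hence I think you are barking up the wrong tree. Even 1432 rows would be processed very quickly.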
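
A quick way to get that number is to count the joined rows directly, shown here for the simplified two-instrument query:

```sql
-- Number of rows that DISTINCT and ORDER BY would have to process.
SELECT  COUNT(*)
    FROM  audratedb AS s
    JOIN  audratedb AS y  ON s.refDate = y.refDate
    WHERE  s.instrument = 'AUDSpot1yFq'
      AND  y.instrument = 'AUD1y1yFq';
```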

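As for the buffer_pool... How big is the table? Do SHOW TABLE STATUS. I suspect the table is more than 1GB and hence cannot fit in the buffer_pool. If so, raising that cache size would let the query run in RAM instead of hitting the disk (at least once the data is cached). Keep in mind that running a query on a cold cache involves lots of I/O. As the cache warms up, queries will run faster. But if the cache is too small, you will continue to need I/O. I/O is the slowest part of the processing.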
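
The two numbers to compare can be pulled like this, assuming the table is InnoDB (the engine has not been stated):

```sql
-- Data_length + Index_length (in bytes) approximates the table's on-disk size.
SHOW TABLE STATUS LIKE 'audratedb';

-- Current size of the InnoDB buffer pool, in bytes.
SHOW VARIABLES LIKE 'innodb_buffer_pool_size';
```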

I hope you have at least 6GB of RAM. Otherwise, 2G could be quite dangerous. Swapping is really bad for performance.

Regarding mysql - MySQL slow duration or fetch time depending on the "distinct" command, a similar question can be found on Stack Overflow: https://stackoverflow.com/questions/34981715/