Description
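According to the EXPLAIN output, a range condition is causing the query to perform a full table scan (160k rows). How can I keep the range condition and reduce the scanning? I suspect the culprit is: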
Y.YEAR BETWEEN 1900 AND 2009 AND
Code
Here is the code that has the range condition (STATION_DISTRICT is likely superfluous).
SELECT
  COUNT(1) as MEASUREMENTS,
  AVG(D.AMOUNT) as AMOUNT,
  Y.YEAR as YEAR,
  MAKEDATE(Y.YEAR,1) as AMOUNT_DATE
FROM
  CITY C,
  STATION S,
  STATION_DISTRICT SD,
  YEAR_REF Y FORCE INDEX(YEAR_IDX),
  MONTH_REF M,
  DAILY D
WHERE
  -- For a specific city ...
  --
  C.ID = 10663 AND
  -- Find all the stations within a specific unit radius ...
  --
  6371.009 *
  SQRT(
    POW(RADIANS(C.LATITUDE_DECIMAL - S.LATITUDE_DECIMAL), 2) +
    (COS(RADIANS(C.LATITUDE_DECIMAL + S.LATITUDE_DECIMAL) / 2) *
      POW(RADIANS(C.LONGITUDE_DECIMAL - S.LONGITUDE_DECIMAL), 2)) ) <= 50 AND
  -- Get the station district identification for the matching station.
  --
  S.STATION_DISTRICT_ID = SD.ID AND
  -- Gather all known years for that station ...
  --
  Y.STATION_DISTRICT_ID = SD.ID AND
  -- The data before 1900 is shaky; insufficient after 2009.
  --
  Y.YEAR BETWEEN 1900 AND 2009 AND
  -- Filtered by all known months ...
  --
  M.YEAR_REF_ID = Y.ID AND
  -- Whittled down by category ...
  --
  M.CATEGORY_ID = '003' AND
  -- Into the valid daily climate data.
  --
  M.ID = D.MONTH_REF_ID AND
  D.DAILY_FLAG_ID <> 'M'
GROUP BY
  Y.YEAR
Update
The SQL is performing a full table scan, which results in MySQL performing a "copy to tmp table", as shown here:
+----+-------------+-------+--------+-----------------------------------+--------------+---------+-------------------------------+--------+-------------+
| id | select_type | table | type   | possible_keys                     | key          | key_len | ref                           | rows   | Extra       |
+----+-------------+-------+--------+-----------------------------------+--------------+---------+-------------------------------+--------+-------------+
|  1 | SIMPLE      | C     | const  | PRIMARY                           | PRIMARY      | 4       | const                         |      1 |             |
|  1 | SIMPLE      | Y     | range  | YEAR_IDX                          | YEAR_IDX     | 4       | NULL                          | 160422 | Using where |
|  1 | SIMPLE      | SD    | eq_ref | PRIMARY                           | PRIMARY      | 4       | climate.Y.STATION_DISTRICT_ID |      1 | Using index |
|  1 | SIMPLE      | S     | eq_ref | PRIMARY                           | PRIMARY      | 4       | climate.SD.ID                 |      1 | Using where |
|  1 | SIMPLE      | M     | ref    | PRIMARY,YEAR_REF_IDX,CATEGORY_IDX | YEAR_REF_IDX | 8       | climate.Y.ID                  |     54 | Using where |
|  1 | SIMPLE      | D     | ref    | INDEX                             | INDEX        | 8       | climate.M.ID                  |     11 | Using where |
+----+-------------+-------+--------+-----------------------------------+--------------+---------+-------------------------------+--------+-------------+
Answer
After using STRAIGHT_JOIN:
+----+-------------+-------+--------+-----------------------------------+---------------+---------+-------------------------------+------+---------------------------------+
| id | select_type | table | type   | possible_keys                     | key           | key_len | ref                           | rows | Extra                           |
+----+-------------+-------+--------+-----------------------------------+---------------+---------+-------------------------------+------+---------------------------------+
|  1 | SIMPLE      | C     | const  | PRIMARY                           | PRIMARY       | 4       | const                         |    1 | Using temporary; Using filesort |
|  1 | SIMPLE      | S     | ALL    | PRIMARY                           | NULL          | NULL    | NULL                          | 7795 | Using where                     |
|  1 | SIMPLE      | SD    | eq_ref | PRIMARY                           | PRIMARY       | 4       | climate.S.STATION_DISTRICT_ID |    1 | Using index                     |
|  1 | SIMPLE      | Y     | ref    | PRIMARY,STAT_YEAR_IDX             | STAT_YEAR_IDX | 4       | climate.S.STATION_DISTRICT_ID | 1650 | Using where                     |
|  1 | SIMPLE      | M     | ref    | PRIMARY,YEAR_REF_IDX,CATEGORY_IDX | YEAR_REF_IDX  | 8       | climate.Y.ID                  |   54 | Using where                     |
|  1 | SIMPLE      | D     | ref    | INDEX                             | INDEX         | 8       | climate.M.ID                  |   11 | Using where                     |
+----+-------------+-------+--------+-----------------------------------+---------------+---------+-------------------------------+------+---------------------------------+
Related
http://dev.mysql.com/doc/refman/5.0/en/how-to-avoid-table-scan.html
http://dev.mysql.com/doc/refman/5.0/en/where-optimizations.html
Optimize SQL that uses between clause
Thank you!
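Best Answer
One request: it looks like you know your data. Add the keyword STRAIGHT_JOIN and see the results:
SELECT STRAIGHT_JOIN ... rest of query ...
STRAIGHT_JOIN tells MySQL to join the tables in the order I've listed them. So, with your CITY table first in the FROM list, you are saying you expect it to be the driving table. On top of that, your WHERE clause on CITY is the immediate filter. With that in place, it will probably fly through the rest of the query.
Hope it helps. It has worked for me on government data with millions of records being queried and joined to 10+ lookup tables, where MySQL was trying to think for me.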
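For reference, here is a sketch of how that suggestion might look when applied to the query from the question. The only change the answer asks for is the STRAIGHT_JOIN keyword after SELECT, which makes MySQL join the tables in the order they appear in the FROM list (C, S, SD, Y, M, D). Dropping the FORCE INDEX(YEAR_IDX) hint is an assumption on my part, inferred from the second EXPLAIN output, where Y is read through STAT_YEAR_IDX rather than YEAR_IDX; the question does not show the exact query that was run.

```sql
-- Sketch only: the question's query with STRAIGHT_JOIN added.
-- Assumption: the FORCE INDEX(YEAR_IDX) hint on YEAR_REF is removed,
-- so the optimizer is free to use another index for Y.
SELECT STRAIGHT_JOIN
  COUNT(1) as MEASUREMENTS,
  AVG(D.AMOUNT) as AMOUNT,
  Y.YEAR as YEAR,
  MAKEDATE(Y.YEAR,1) as AMOUNT_DATE
FROM
  -- With STRAIGHT_JOIN, tables are joined in exactly this order.
  CITY C,
  STATION S,
  STATION_DISTRICT SD,
  YEAR_REF Y,
  MONTH_REF M,
  DAILY D
WHERE
  -- The single-city filter is applied first ...
  C.ID = 10663 AND
  -- ... then stations within the 50-unit radius of that city.
  6371.009 *
  SQRT(
    POW(RADIANS(C.LATITUDE_DECIMAL - S.LATITUDE_DECIMAL), 2) +
    (COS(RADIANS(C.LATITUDE_DECIMAL + S.LATITUDE_DECIMAL) / 2) *
      POW(RADIANS(C.LONGITUDE_DECIMAL - S.LONGITUDE_DECIMAL), 2)) ) <= 50 AND
  S.STATION_DISTRICT_ID = SD.ID AND
  Y.STATION_DISTRICT_ID = SD.ID AND
  -- The range condition is kept unchanged.
  Y.YEAR BETWEEN 1900 AND 2009 AND
  M.YEAR_REF_ID = Y.ID AND
  M.CATEGORY_ID = '003' AND
  M.ID = D.MONTH_REF_ID AND
  D.DAILY_FLAG_ID <> 'M'
GROUP BY
  Y.YEAR
```

Running EXPLAIN on a query shaped like this is the way to check whether YEAR_REF is now reached by a ref lookup per station district instead of the 160k-row range scan.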