问题描述
我有一个表(MainTable
),其中有600,000多条记录.它通过父/子类型关系中的第二个表(JoinTable
)连接到自身:
I have a table (MainTable
) with a bit over 600,000 records. It joins onto itself via a 2nd table (JoinTable
) in a parent/child type relationship:
SELECT Child.ID, Parent.ID
FROM MainTable
AS Child
JOIN JoinTable
ON Child.ID = JoinTable.ID
JOIN MainTable
AS Parent
ON Parent.ID = JoinTable.ParentID
AND Parent.SomeOtherData = Child.SomeOtherData
我知道每个子记录都有一个父记录,并且JoinTable中的数据是准确的.
I know that every child record has a parent record and the data in JoinTable is acurate.
运行此查询时,实际上需要几分钟的时间.但是,如果我使用左连接"加入父级",则<运行1秒钟:
When I run this query it takes literally minutes to run. However if I join to Parent using a Left Join then it takes < 1 second to run:
SELECT Child.ID, Parent.ID
FROM MainTable
AS Child
JOIN JoinTable
ON Child.ID = JoinTable.ID
LEFT JOIN MainTable
AS Parent
ON Parent.ID = JoinTable.ParentID
AND Parent.SomeOtherData = Child.SomeOtherData
WHERE ...[some info to make sure we don't select parent records in the child dataset]...
我了解INNER JOIN
和LEFT JOIN
之间的结果差异.在这种情况下,它返回的结果与每个孩子都有父母的情况完全相同.如果我让两个查询都运行,我可以比较数据集,它们是完全一样的.
I understand the difference in the results between an INNER JOIN
and a LEFT JOIN
. In this case it is returning exactly the same result as every child has a parent. If I let both queries run, I can compare the datasets and they are exactly the same.
为什么LEFT JOIN
的运行速度比INNER JOIN
快得多?
Why is it that a LEFT JOIN
runs so much faster than an INNER JOIN
?
更新检查查询计划,并在使用内部联接时从父数据集开始.进行左连接时,它从子数据集开始.
UPDATEChecked the query plans and when using an inner join it starts with the Parent dataset. When doing a left join it starts with the child dataset.
它使用的索引都是相同的.
The indexes it uses are all the same.
我可以强迫它总是从孩子开始吗?使用左联接有效,只是感觉不对.
Can I force it to always start with the child? Using a left join works, it just feels wrong.
之前曾在这里问过类似的问题,但似乎没有人回答我的问题.
Similar questions have been asked here before, but none seem to answer my question.
例如在> SQL Server中的内联接与左联接性能中选择的答案表示左联接总是比内联接慢.这种说法是有道理的,但这不是我所看到的.
e.g. the selected answer in INNER JOIN vs LEFT JOIN performance in SQL Server says that Left Joins are always slower than Inner joins. The argument makes sense, but it's not what I'm seeing.
推荐答案
左连接似乎更快,因为SQL被迫先执行较小的选择,然后再连接到此较小的记录集.由于某种原因,优化器不希望自然地这样做.
The Left join seems to be faster because SQL is forced to do the smaller select first and then join to this smaller set of records. For some reason the optimiser doesn't want to do this naturally.
三种强制连接以正确顺序发生的方法:
3 ways to force the joins to happen in the right order:
- 将第一个数据子集选择到临时表(或表变量)中,然后对其进行联接
- 使用左联接(请记住,这可能返回不同的数据,因为它是左联接而不是内部联接)
- 使用FORCE ORDER关键字.请注意,如果表大小或架构更改,则查询计划可能不正确(请参阅 https://dba. stackexchange.com/questions/45388/force-join-order )
这篇关于LEFT JOIN显着快于INNER JOIN的文章就介绍到这了,希望我们推荐的答案对大家有所帮助,也希望大家多多支持!