问题描述
以下查询需要很长时间才能完成.但是,如果我删除 top 10 子句,它会很快完成.big_table_1 和 big_table_2 是 2 个表,有 10^5 条记录.
The following query takes forever to finish. But if I remove the top 10 clause, it finishs rather quickly. big_table_1 and big_table_2 are 2 tables with 10^5 records.
我曾经相信 top 子句 会减少时间成本,但它显然不是在这里.为什么???
I used to believe that top clause will reduce the time cost, but it's apparently not here. Why???
select top 10 ServiceRequestID
from
(
(select *
from big_table_1
where big_table_1.StatusId=2
) cap1
inner join
big_table_2 cap2
on cap1.ServiceRequestID = cap2.CustomerReferenceNumber
)
推荐答案
还有关于同一主题的其他 stackoverflow 讨论(链接在底部).正如上面的评论中所指出的,它可能与索引和优化器混淆并使用错误的索引有关.
There are other stackoverflow discussions on this same topic (links at bottom). As noted in the comments above it might have something to do with indexes and the optimizer getting confused and using the wrong one.
我的第一个想法是您正在执行 select top serviceid from (select *....) 并且优化器可能难以将查询下推到内部查询并使用索引.
My first thought is that you are doing a select top serviceid from (select *....) and the optimizer may have difficulty pushing the query down to the inner queries and making using of the index.
考虑将其重写为
select top 10 ServiceRequestID
from big_table_1
inner join big_table_2 cap2
on cap1.servicerequestid = cap2.customerreferencenumber
and big_table_1.statusid = 2
在您的查询中,数据库可能正在尝试合并结果并返回它们,然后将其限制为外部查询中的前 10 个.在上述查询中,数据库只需要在合并结果时收集前 10 个结果,从而节省大量时间.并且如果 servicerequestID 被索引,它一定会使用它.在您的示例中,查询正在以虚拟、未编入索引的格式返回的结果集中查找 servicerequestid 列.
In your query, the database is probably trying to merge the results and return them and THEN limit it to the top 10 in the outer query. In the above query the database will only have to gather the first 10 results as results are being merged, saving loads of time. And if servicerequestID is indexed, it will be sure to use it. In your example, the query is looking for the servicerequestid column in a result set that has already been returned in a virtual, unindexed format.
希望这是有道理的.虽然假设优化器应该采用我们放入 SQL 的任何格式并找出每次返回值的最佳方式,但事实是我们将 SQL 放在一起的方式确实会影响某些步骤在数据库.
Hope that makes sense. While hypothetically the optimizer is supposed to take whatever format we put SQL in and figure out the best way to return values every time, the truth is that the way we put our SQL together can really impact the order in which certain steps are done on the DB.
为什么这样做SQL Server 中索引列上的 top(1) 慢吗?
这篇关于为什么选择 Top 子句会导致很长的时间成本的文章就介绍到这了,希望我们推荐的答案对大家有所帮助,也希望大家多多支持!