Question
I have a table with a lot of data, and I need to join it with some other large tables.
Only a small portion of my table is actually relevant to me each time.
When is it best to filter my data?
- In the where clause of the SQL.
- Create a temp table with the specific data and only then join it.
- Add the predicate to the first inner join ON clause.
- Some other idea.
1.
Select *
From RealyBigTable
Inner Join AnotherBigTable On …
Inner Join YetAnotherBigTable On …
Where RealyBigTable.Type = ?
2.
Select *
Into #temp
From RealyBigTable
Where RealyBigTable.Type = ?
Select *
From #temp
Inner Join AnotherBigTable On …
Inner Join YetAnotherBigTable On …
3.
Select *
From RealyBigTable
Inner Join AnotherBigTable On RealyBigTable.type = ? And …
Inner Join YetAnotherBigTable On …
Another question: what happens first, the Join or the Where?
Recommended answer
Because you are using INNER JOINs, the WHERE versus JOIN debate comes down to taste and style. Personally, I like to keep the links between the two tables (e.g. the foreign key relationship) in the ON clause, and the actual filters against the data in the WHERE clause.
For INNER JOINs, SQL Server will parse both forms of the query into the same token tree, and will therefore build identical query execution plans.
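The equivalence of the two filter placements can at least be checked at the level of results. The following is a minimal sketch, assuming Python's built-in sqlite3 module as a stand-in for SQL Server and hypothetical tables `big` and `other` that are not part of the original question. It only shows that both forms return the same rows; it says nothing about SQL Server's actual execution plans.

```python
import sqlite3

# Hypothetical in-memory schema standing in for the question's big tables.
conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE big (id INTEGER PRIMARY KEY, type TEXT);
    CREATE TABLE other (id INTEGER PRIMARY KEY, big_id INTEGER, note TEXT);
    INSERT INTO big VALUES (1, 'A'), (2, 'B'), (3, 'A');
    INSERT INTO other VALUES (10, 1, 'x'), (11, 2, 'y'), (12, 3, 'z');
""")

# Filter placed in the WHERE clause.
filter_in_where = conn.execute("""
    SELECT big.id, other.note
    FROM big
    INNER JOIN other ON other.big_id = big.id
    WHERE big.type = ?
    ORDER BY big.id
""", ("A",)).fetchall()

# Same filter moved into the ON clause of the inner join.
filter_in_on = conn.execute("""
    SELECT big.id, other.note
    FROM big
    INNER JOIN other ON other.big_id = big.id AND big.type = ?
    ORDER BY big.id
""", ("A",)).fetchall()

print(filter_in_where == filter_in_on)
```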
If you were using [LEFT/RIGHT] OUTER JOINs instead, it would make a world of difference: not only would the performance probably differ, but very likely the results would too.
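To illustrate why the results can differ with an outer join, here is a small sketch, again assuming Python's sqlite3 module and hypothetical tables `big` and `other` rather than the asker's real schema. With a LEFT OUTER JOIN, a filter in WHERE discards non-matching rows from the left table, while the same filter in ON keeps them and pads the right side with NULL.

```python
import sqlite3

# Hypothetical two-row tables; one row of type 'A', one of type 'B'.
conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE big (id INTEGER PRIMARY KEY, type TEXT);
    CREATE TABLE other (id INTEGER PRIMARY KEY, big_id INTEGER, note TEXT);
    INSERT INTO big VALUES (1, 'A'), (2, 'B');
    INSERT INTO other VALUES (10, 1, 'x'), (11, 2, 'y');
""")

# Filter in WHERE: applied after the join, so the type 'B' row disappears.
left_where = conn.execute("""
    SELECT big.id, other.note
    FROM big
    LEFT OUTER JOIN other ON other.big_id = big.id
    WHERE big.type = 'A'
    ORDER BY big.id
""").fetchall()

# Filter in ON: only restricts which rows match; the unmatched left row
# is still returned, with NULL (None in Python) for the right-hand columns.
left_on = conn.execute("""
    SELECT big.id, other.note
    FROM big
    LEFT OUTER JOIN other ON other.big_id = big.id AND big.type = 'A'
    ORDER BY big.id
""").fetchall()

print(left_where)  # [(1, 'x')]
print(left_on)     # [(1, 'x'), (2, None)]
```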
To answer your other questions:
When is it best to filter my data?
- In the where clause of the SQL.
- Create a temp table with the specific data and only then join it.
- Add the predicate to the first inner join ON clause.
- Some other idea.
It makes no difference whether the filter is in the WHERE or the ON clause; both are treated the same. For 3, the "first inner join" has no relevance: in a multi-table INNER JOIN scenario, it really doesn't matter which join comes first in the query text, as the query optimizer will reorder them as it sees fit.
Using a temp table is completely unnecessary and won't help, because you have to extract the relevant portion anyway, which is what a JOIN would do as well. Moreover, if you have a good index on the JOIN conditions/WHERE filter, the index will be used to visit only the relevant data without looking at the rest of the table(s).
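The point about indexes can be sketched as follows. This assumes Python's sqlite3 module and SQLite's `EXPLAIN QUERY PLAN` as a stand-in; the table `big`, the index name `idx_big_type`, and the helper `plan` are hypothetical. SQL Server reports its plans through its own tooling and its optimizer may choose differently, so treat this only as an illustration of a filter being answered from an index instead of a full scan.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE big (id INTEGER PRIMARY KEY, type TEXT, payload TEXT)")

def plan(sql):
    # The last column of each EXPLAIN QUERY PLAN row is the readable plan step.
    return [row[-1] for row in conn.execute("EXPLAIN QUERY PLAN " + sql)]

query = "SELECT * FROM big WHERE type = 'A'"

# Without an index on the filter column, the whole table has to be scanned.
plan_without_index = plan(query)

# With an index on the filter column, the lookup can go through the index.
conn.execute("CREATE INDEX idx_big_type ON big (type)")
plan_with_index = plan(query)

print(plan_without_index)
print(plan_with_index)
```

The exact wording of the plan steps varies between SQLite versions, so the sketch prints them without assuming a particular format.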