问题描述
我编写了一个表值UDF,该值以CTE开头,以从大型表返回行的子集。
CTE中有多个联接。一些内部表和一个左侧表连接到其他表,这些表不包含很多行。
CTE具有where子句,该子句返回日期范围内的行,以便仅返回所需的行。
I have written a table-valued UDF that starts by a CTE to return a subset of the rows from a large table.There are several joins in the CTE. A couple of inner and one left join to other tables, which don't contain a lot of rows.The CTE has a where clause that returns the rows within a date range, in order to return only the rows needed.
然后我引用此CTE在4个自左联接中,为了使用不同的条件建立小计。
I'm then referencing this CTE in 4 self left joins, in order to build subtotals using different criterias.
查询是相当复杂的,但这是它的简化伪版本
The query is quite complex but here is a simplified pseudo-version of it
WITH DataCTE as
(
SELECT [columns] FROM table
INNER JOIN table2
ON [...]
INNER JOIN table3
ON [...]
LEFT JOIN table3
ON [...]
)
SELECT [aggregates_columns of each subset] FROM DataCTE Main
LEFT JOIN DataCTE BananasSubset
ON [...]
AND Product = 'Bananas'
AND Quality = 100
LEFT JOIN DataCTE DamagedBananasSubset
ON [...]
AND Product = 'Bananas'
AND Quality < 20
LEFT JOIN DataCTE MangosSubset
ON [...]
GROUP BY [
我感到SQL Server感到困惑,并为每个自连接调用CTE,这似乎通过查看执行计划得到了证实,尽管我承认不是阅读那些专家。
I have the feeling that SQL Server gets confused and calls the CTE for each self join, which seems confirmed by looking at the execution plan, although I confess not being an expert at reading those.
我本以为SQL Server足够聪明,只能从CTE执行一次数据检索,而不是执行几次。
I would have assumed SQL Server to be smart enough to only perform the data retrieval from the CTE only once, rather than do it several times.
我尝试了相同的方法,但是没有使用CTE来获取数据的子集,而是使用了与CTE中相同的选择查询,而是将其输出到临时表中。
I have tried the same approach but rather than using a CTE to get the subset of the data, I used the same select query as in the CTE, but made it output to a temp table instead.
引用CTE版本的版本需要40秒。引用临时表的版本需要1到2秒钟。
The version referring the CTE version takes 40 seconds. The version referring the temp table takes between 1 and 2 seconds.
为什么SQL Server不够聪明,无法将CTE结果保存在内存中?
Why isn't SQL Server smart enough to keep the CTE results in memory?
我喜欢CTE,在这种情况下尤其如此,因为我的UDF是表值的,因此它使我可以将所有内容保留在一个语句中。
I like CTEs, especially in this case as my UDF is a table-valued one, so it allowed me to keep everything in a single statement.
要使用临时表,我需要编写一个多语句表,其值是UDF,我发现它的解决方案稍微不太优雅。
To use a temp table, I would need to write a multi-statement table valued UDF, which I find a slightly less elegant solution.
您在CTE中遇到了这种性能问题,如果是这样,您如何对它们进行排序?
Did some of you had this kind of performance issues with CTE, and if so, how did you get them sorted?
谢谢
Kharlos
推荐答案
我相信每次都会检索CTE结果。使用临时表,结果将被存储直到被删除。这似乎可以解释您切换到临时表时看到的性能提升。
I believe that CTE results are retrieved every time. With a temp table the results are stored until it is dropped. This would seem to explain the performance gains you saw when you switched to a temp table.
另一个好处是,您可以在临时表上创建索引,而无法创建索引做一个CTE。不知道您的情况是否有好处,但是很高兴知道。
Another benefit is that you can create indexes on a temporary table which you can't do to a cte. Not sure if there would be a benefit in your situation but it's good to know.
相关阅读:
- Which are more performant, CTE or temporary tables?
- SQL 2005 CTE vs TEMP table Performance when used in joins of other tables
- http://msdn.microsoft.com/en-us/magazine/cc163346.aspx#S3
引用最后一个链接:
我会说去临时表。不幸的是,优雅并不总是最好的解决方案。
I'd say go with the temp table. Unfortunately elegant isn't always the best solution.
更新:
嗯这使事情变得更加困难。如果不考虑整个环境,我很难说。
Hmmm that makes things more difficult. It's hard for me to say with out looking at your whole environment.
一些想法:
- 可以使用存储过程代替UDF吗? (而不是从内部)?
- 这可能无法实现,但是如果您可以从CTE中删除
左联接
,则可以将其移至索引视图中。如果您能够做到这一点,那么即使在临时表上,您也可能会看到性能提升。
- can you use a stored procedure instead of a UDF (instead, not from within)?
- This may not be possible but if you can remove the
left join
from you CTE you could move that into an indexed view. If you are able to do this you may see performance gains over even the temp table.
这篇关于自连接中引用的SQL Server CTE速度很慢的文章就介绍到这了,希望我们推荐的答案对大家有所帮助,也希望大家多多支持!