为什么索引不与子查询一起使用

本文介绍了为什么索引不与子查询一起使用的处理方法，对大家解决问题具有一定的参考价值，需要的朋友们下面随着小编来一起学习吧！

问题描述

这需要花费0.001秒的时间来执行，并且使用索引查找

This takes 0.001 seconds to execute and it uses index seek

SELECT * FROM CUSTOMER WHERE ID IN (1008,1122)

现在我有一个存储过程U_VIP，它返回与示例一相同的ID(1008、1122)，并且只需要0.001秒即可执行

Now I have a stored procedure U_VIP which returns the same ID as example one (1008,1122), and it takes only 0.001 second to execute

SELECT ID FROM U_VIP    //returns (1008,1122)

现在，当我将它们组合在一起时，执行大约需要半秒钟，并且不使用索引

Now when I combine them, it takes around half-a-second to execute and index is not used

SELECT * FROM CUSTOMER WHERE ID IN (SELECT ID FROM U_VIP)

我已经简化了上面的示例，在实际应用中，性能受到更高幅度的影响.在这种情况下，如何强制Firebird使用索引?

I've simplified the example above, in actual application the performance is impacted by much higher magnitude. How to force Firebird to use index in this case?

**使用Firebird 2.1

**Using Firebird 2.1

**编辑**

基于Mark的答案，使用JOIN确实可以缩短执行时间，因为它现在正在执行索引查找.

Base on Mark's answer, use JOIN does improve the execution time because it is now doing index seek.

SELECT CUSTOMER.*
FROM CUSTOMER
INNER JOIN U_VIP ON U_VIP.ID = CUSTOMER.ID

这很好，但是，它为我带来了另一个问题，我将在以下示例中尝试解释.

This is great, however, it introduces another problem for me which I'll try to explain in the following example.

SELECT CUSTOMER.*
FROM CUSTOMER
WHERE (:AREAID = 0 OR ID IN (SELECT ID FROM U_VIP(:AREAID)))

使用where子句，我可以根据:AREAID是否由用户提供来有条件地应用过滤器.当我用联接替换where子句时，如何达到相同的目的?

Using where clause, I can conditionally apply the filter base on whether :AREAID is supplied by the user. How do I achieve the same when I replace the where clause with a join?

类似的东西:

SELECT CUSTOMER.*
FROM CUSTOMER
{IF :AREAID > 0 THEN}
INNER JOIN (SELECT ID FROM U_VIP(:AREAID)) VIP ON VIP.ID = CUSTOMER.ID
{END IF}

当然，火鸟不喜欢用大括号=/

Which of course, Firebird dislikes the part with braces =/

推荐答案

您需要使用EXISTS或INNER JOIN代替IN.我并不完全确定细节，但是我相信您的查询中的CUSTOMER表已被完全读取，可以评估每行子查询的结果(甚至可以对每行执行子查询).由于优化器无法预先知道子查询的结果数，因此无法像在第一次查询中那样使用固定数量的文字值来创建优化.

Instead of IN, you need to use EXISTS or an INNER JOIN. I am not entirely sure about the details, but I believe in your query the CUSTOMER table is fully read, evaluating the result of the subquery for every row (maybe even executing the subquery for every row). As the optimizer doesn't know the number of results of the subquery in advance, it can't create an optimization like it can if you use a fixed number of literal values like in your first query.

尝试将查询更改为:

SELECT *
FROM CUSTOMER
WHERE EXISTS (SELECT 1 FROM U_VIP WHERE U_VIP.ID = CUSTOMER.ID)

或者:

SELECT CUSTOMER.*
FROM CUSTOMER
INNER JOIN U_VIP ON U_VIP.ID = CUSTOMER.ID

或者(有时更改顺序可以带来更好的性能):

Or (changing the order sometimes can lead to better performance):

SELECT CUSTOMER.*
FROM U_VIP
INNER JOIN CUSTOMER ON CUSTOMER.ID = U_VIP.ID

通常，我希望这些查询的性能比使用IN的查询更好.

In general I would expect those queries to perform better than the query with IN.

编辑以响应更新

根据您更新的问题，我可以想到多种解决方案，但是我不确定它们的性能.

Based on your updated question I can think of multiple solutions, I am not entirely sure on their performance though.

对:AREAID为0和:AREAID不为0使用单独的查询
使用存储过程或 EXECUTE BLOCK 带有EXECUTE STATEMENT并带有动态构建的语句(前一个版本)
如果:AREAID为0，则使存储过程U_VIP返回所有客户
使用附加的JOIN条件OR :AREAID = 0;如果U_VIP的0不返回任何值(并且可能不执行*)，则可能不会产生结果
使用LEFT JOIN并添加WHERE U_VIP.ID IS NOT NULL OR :AREAID = 0(可能不执行*)
对CUSTOMER使用UNION的普通"查询，对CUSTOMER使用第二个查询(可能不执行*)

Use separate queries for :AREAID is 0 and :AREAID is not 0
Use a stored procedure or EXECUTE BLOCK with an EXECUTE STATEMENT with a dynamically built statement (variant of the previous)
Make the stored procedure U_VIP return all customers if :AREAID is 0
Use an additional JOIN condition OR :AREAID = 0; this might not yield results if U_VIP returns nothing for 0 (and might not perform *)
Use a LEFT JOIN and add WHERE U_VIP.ID IS NOT NULL OR :AREAID = 0 (might not perform *)
Use a UNION of the 'normal' query and a second query on CUSTOMER with WHERE :AREAID = 0 (might not perform *)

对于(*)，请参见智能逻辑" 反模式

对于动态构建的查询，您可以想到以下内容:

For the dynamically built query you can think of something like:

EXECUTE BLOCK (INPUTCONDITION INTEGER = ?)
    RETURNS (ID INTEGER)
AS
    DECLARE VARIABLE QUERY VARCHAR(6400);
BEGIN
    QUERY = 'SELECT a.ID FROM SORT_TEST a';
    IF (INPUTCONDITION <> 0) then
        QUERY = QUERY || ' WHERE a.ID = ' || INPUTCONDITION;
    FOR EXECUTE STATEMENT QUERY INTO :ID
    DO
        SUSPEND;
END

在此示例中，INPUTCONDITION的值0将生成不带WHERE子句的查询，而对于其他输入，具有WHERE子句的查询.如果参数为(VAR)CHAR或BLOB，则这样做很容易导致SQL注入，因此请小心.您还可以考虑两个分支，其中一个分支使用 EXECUTE STATEMENT 带参数，另一个不带参数.

In this example the value 0 for INPUTCONDITION will generate a query without WHERE-clause, and for other inputs a query with a WHERE-clause. Doing it like this is prone to SQL injection if the parameter is a (VAR)CHAR or BLOB, so be careful. You could also consider two branches where one uses EXECUTE STATEMENT with parameters and the other without.

代替EXECUTE BLOCK，您也可以使用可选的过程，就像已经用于U_VIP一样. EXECUTE BLOCK本质上是一个没有存储在数据库中的存储过程.

Instead of EXECUTE BLOCK, you can also use a selectable procedure like you already use for U_VIP; EXECUTE BLOCK is essentially a stored procedure that isn't stored in the database.

另请参见神话:动态SQL是慢"

这篇关于为什么索引不与子查询一起使用的文章就介绍到这了，希望我们推荐的答案对大家有所帮助，也希望大家多多支持！