我们有一个这样的查询。

SELECT DISTINCT table_a.userid       AS userId,
                table_a.screenname   AS screenName,
FROM   table_a
       LEFT JOIN table_b
              ON ( table_a.userid = table_b.userid )
WHERE  ( table_b.organizationid IS NULL )
       AND ( table_a.companyid = '20002' )
       AND ( table_a.defaultuser = 'f' )
       AND ( ( Lower(table_a.firstname) LIKE '%7189%'
               AND '%7189%' IS NOT NULL )
              OR ( Lower(table_a.middlename) LIKE '%7189%'
                   AND '%7189%' IS NOT NULL )
              OR ( Lower(table_a.lastname) LIKE '%7189%'
                   AND '%7189%' IS NOT NULL )
              OR ( Lower(table_a.screenname) LIKE '%7189%'
                   AND '%7189%' IS NOT NULL )
              OR ( Lower(table_a.emailaddress) LIKE '%7189%'
                   AND '%7189%' IS NOT NULL ) )
       AND ( table_a.status = '0' )

table_b.organizationid具有not NULL约束,因此table_b.organizationid不为NULL等于1=2,但postgres仍然执行查询的剩余部分,创建临时表并在15秒后返回零行。
如果不更改查询,是否可以提高此查询的性能。
DB版本是9.1,但9.3也有相同的行为。
Postgres版本:PostgreSQL 9.1.14,在x86_64-unknown-linux-gnu上,由gcc(Ubuntu/Linaro 4.6.3-1ubuntu5)4.6.3编译,64位
表a有230万行
表b有540万行
表a有40列(太多,无法在此列出)
表b有两列
     Column     |  Type  | Modifiers
----------------+--------+-----------
 userid         | bigint | not null
 organizationid | bigint | not null
Indexes:
    "table_b_pkey" PRIMARY KEY, btree (userid, organizationid)
    "ix_7ef4ec0e" btree (organizationid)
    "ix_fb646ca6" btree (userid)

解释计划
HashAggregate  (cost=638937.42..638937.43 rows=1 width=72) (actual time=21386.436..21386.436 rows=0 loops=1)
   ->  Hash Right Join  (cost=443599.10..638937.40 rows=1 width=72) (actual time=21386.433..21386.433 rows=0 loops=1)
         Hash Cond: (table_b.userid = table_a.userid)
         Filter: (table_b.organizationid IS NULL)
         ->  Seq Scan on table_b  (cost=0.00..95488.04 rows=6020704 width=16) (actual time=0.009..4158.880 rows=5497919 loops=1)
         ->  Hash  (cost=441059.52..441059.52 rows=104846 width=72) (actual time=12356.795..12356.795 rows=215 loops=1)
               Buckets: 16384  Batches: 2  Memory Usage: 12kB
               ->  Seq Scan on table_a  (cost=0.00..441059.52 rows=104846 width=72) (actual time=43.250..12355.735 rows=215 loops=1)
                     Filter: ((NOT defaultuser) AND (companyid = 20002::bigint) AND (status = 0) AND ((lower((firstname)::text) ~~ '%7189%'::text) OR (lower((middlename)::text) ~~ '%7189%'::text) OR (lower((lastname)::text) ~~ '%7189%'::text) OR (lower((screenname)::text) ~~ '%7189%'::text) OR (lower((emailaddress)::text) ~~ '%7189%'::text)))
 Total runtime: 21386.608 ms
(10 rows)

--萨梅尔

最佳答案

我知道你不想重写查询。但查询重写对计划员有很大帮助。我会把这个问题写成:
SELECT DISTINCT table_a.userid AS userId, table_a.screenname AS screenName,FROM table_aLEFT JOIN(select userid from --i think only userid is needed in this query table_b table_b.organizationid IS NULL ) as table_b ON ( table_a.userid = table_b.userid )WHERE
( table_a.companyid = '20002' ) AND ( table_a.defaultuser = 'f' ) AND ( ( Lower(table_a.firstname) LIKE '%7189%' AND '%7189%' IS NOT NULL ) OR ( Lower(table_a.middlename) LIKE '%7189%' AND '%7189%' IS NOT NULL ) OR ( Lower(table_a.lastname) LIKE '%7189%' AND '%7189%' IS NOT NULL ) OR ( Lower(table_a.screenname) LIKE '%7189%' AND '%7189%' IS NOT NULL ) OR ( Lower(table_a.emailaddress) LIKE '%7189%' AND '%7189%' IS NOT NULL ) ) AND ( table_a.status = '0' )

关于sql - 当条件始终为false时的postgres查询优化,我们在Stack Overflow上找到一个类似的问题:https://stackoverflow.com/questions/36760305/

10-14 02:15