问题描述
简化表:
CREATE TABLE products (
product_no integer PRIMARY KEY,
sales integer,
status varchar(16),
category varchar(16));
CREATE INDEX index_products_sales ON products (sales);
CREATE INDEX index_products_status ON products (status);
CREATE INDEX index_products_category ON products (category);
PostgreSQL的版本是8.4。 状态和类别列
PostgreSQL version is 8.4. Columns 'status' and 'category'
有2000万种产品/行分布在15个类别中。
There are 20 million products/rows spread across 15 categories.
使用最频繁的查询之一是获取销量最高的三个产品,不包括 cat3和 cat7类别中的产品:
One of the most used queries is getting the three most sold products, excluding products in categories 'cat3' and 'cat7':
SELECT product_no, sales
FROM products
WHERE status = 'something' AND category NOT IN ('cat3', 'cat7')
ORDER BY sales DESC
LIMIT 3;
Limit (cost=0.00..8833.39 rows=3 width=12) (actual time=9235.332..9356.284 rows=3 loops=1)
-> Index Scan using index_products_sales on products (cost=0.00..68935806.85 rows=23412 width=12) (actual time=9235.327..9356.278 rows=3 loops=1)
Filter: (((category)::text <> ALL ('{cat3,cat7}'::text[])) AND ((status)::text = 'something'::text))
使此特定查询运行更快的最佳索引是什么?
What would be the best index for making this specific query run faster?
推荐答案
使用以下特定排序顺序创建部分多列索引:
Create a partial, multicolumn index with this particular sort order:
CREATE INDEX products_status_sales_partial_idx ON products (status, sales DESC)
WHERE category NOT IN ('cat3','cat7');
稍微修改您的查询:
SELECT product_no, sales
FROM products
WHERE status = 'something'
AND category NOT IN ('cat3', 'cat7')
ORDER BY status, sales DESC
LIMIT 3;
首先添加状态
ORDER BY
子句的元素似乎多余且毫无意义。但请尝试一下。
Adding status
as first element of the ORDER BY
clause seems redundant and pointless. But give it a try.
查询计划器的智能不足以理解
The query planner is not smart enough to understand, that with
WHERE status = 'something' ...
ORDER BY sales DESC
索引(状态,销售DESC)
的排序顺序匹配。因此,它将读取所有个符合条件的行,并排序并选择前3个。
the sort order of the index (status, sales DESC)
matches as a logical consequence. So it is going to read all qualifying rows, sort and pick the top 3.
通过添加状态
到 ORDER BY
,您可以使查询计划程序直接从索引中读取前3个条目。预计会加快几个数量级。
By adding status
to the ORDER BY
you enable the query planner to read the top 3 entries from the index directly. Expect a speed-up by several orders of magnitude.
已通过PostgreSQL 8.4和9.1测试。
Tested with PostgreSQL 8.4 and 9.1.
这篇关于为PostgreSQL查询选择正确的索引的文章就介绍到这了,希望我们推荐的答案对大家有所帮助,也希望大家多多支持!