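Question
- When should I use a composite index in a database?
- What are the performance ramifications of using a composite index?
- Why should I use a composite index?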
For example, I have a homes table:
CREATE TABLE IF NOT EXISTS `homes` (
`home_id` int(10) unsigned NOT NULL auto_increment,
`sqft` smallint(5) unsigned NOT NULL,
`year_built` smallint(5) unsigned NOT NULL,
`geolat` decimal(10,6) default NULL,
`geolng` decimal(10,6) default NULL,
PRIMARY KEY (`home_id`),
KEY `geolat` (`geolat`),
KEY `geolng` (`geolng`)
) ENGINE=InnoDB ;
Does it make sense for me to use a composite index on both geolat and geolng, such that I replace:
KEY `geolat` (`geolat`),
KEY `geolng` (`geolng`),
with:
KEY `geolat_geolng` (`geolat`, `geolng`)
If so:
- Why?
- What are the performance ramifications of using a composite index?
UPDATE:
Since many people have stated that it depends entirely on the queries I perform, below is the most common query:
SELECT * FROM homes
WHERE geolat BETWEEN ??? AND ???
AND geolng BETWEEN ??? AND ???
UPDATE 2:
With the following database schema:
CREATE TABLE IF NOT EXISTS `homes` (
`home_id` int(10) unsigned NOT NULL auto_increment,
`primary_photo_group_id` int(10) unsigned NOT NULL default '0',
`customer_id` bigint(20) unsigned NOT NULL,
`account_type_id` int(11) NOT NULL,
`address` varchar(128) collate utf8_unicode_ci NOT NULL,
`city` varchar(64) collate utf8_unicode_ci NOT NULL,
`state` varchar(2) collate utf8_unicode_ci NOT NULL,
`zip` mediumint(8) unsigned NOT NULL,
`price` mediumint(8) unsigned NOT NULL,
`sqft` smallint(5) unsigned NOT NULL,
`year_built` smallint(5) unsigned NOT NULL,
`num_of_beds` tinyint(3) unsigned NOT NULL,
`num_of_baths` decimal(3,1) unsigned NOT NULL,
`num_of_floors` tinyint(3) unsigned NOT NULL,
`description` text collate utf8_unicode_ci,
`geolat` decimal(10,6) default NULL,
`geolng` decimal(10,6) default NULL,
`display_status` tinyint(1) NOT NULL,
`date_listed` timestamp NOT NULL default CURRENT_TIMESTAMP,
`contact_email` varchar(100) collate utf8_unicode_ci NOT NULL,
`contact_phone_number` varchar(15) collate utf8_unicode_ci NOT NULL,
PRIMARY KEY (`home_id`),
KEY `customer_id` (`customer_id`),
KEY `city` (`city`),
KEY `num_of_beds` (`num_of_beds`),
KEY `num_of_baths` (`num_of_baths`),
KEY `geolat` (`geolat`),
KEY `geolng` (`geolng`),
KEY `account_type_id` (`account_type_id`),
KEY `display_status` (`display_status`),
KEY `sqft` (`sqft`),
KEY `price` (`price`),
KEY `primary_photo_group_id` (`primary_photo_group_id`)
) ENGINE=InnoDB DEFAULT CHARSET=utf8 COLLATE=utf8_unicode_ci AUTO_INCREMENT=8 ;
Using the following SQL:
EXPLAIN SELECT homes.home_id,
address,
city,
state,
zip,
price,
sqft,
year_built,
account_type_id,
num_of_beds,
num_of_baths,
geolat,
geolng,
photo_id,
photo_url_dir
FROM homes
LEFT OUTER JOIN home_photos ON homes.home_id = home_photos.home_id
AND homes.primary_photo_group_id = home_photos.home_photo_group_id
AND home_photos.home_photo_type_id = 2
WHERE homes.display_status = true
AND homes.geolat BETWEEN -100 AND 100
AND homes.geolng BETWEEN -100 AND 100
EXPLAIN returns:
id  select_type  table        type  possible_keys                                   key                  key_len  ref                           rows  Extra
--  -----------  -----------  ----  ----------------------------------------------  -------------------  -------  ----------------------------  ----  -----------
1   SIMPLE       homes        ref   geolat,geolng,display_status                    display_status       1        const                         2     Using where
1   SIMPLE       home_photos  ref   home_id,home_photo_type_id,home_photo_group_id  home_photo_group_id  4        homes.primary_photo_group_id  4
I don't quite understand how to read the EXPLAIN output. Does this look good or bad? Right now, I am NOT using a composite index for geolat and geolng. Should I be?
Recommended answer
You should use a composite index when your queries benefit from it. A composite index that looks like this:
index( column_A, column_B, column_C )
will benefit queries that use those columns for joining, filtering, and sometimes selecting. It will also benefit queries that use a leftmost subset of the columns in that composite. So the above index will also satisfy queries that need
index( column_A, column_B, column_C )
index( column_A, column_B )
index( column_A )
But it will not (at least not directly; it may help partially if there is no better index) help queries that need
index( column_A, column_C )
Notice how column_B is missing.
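As a rough illustration of the leftmost-prefix rule described above, the sketch below uses a hypothetical table t with an index on (column_A, column_B, column_C). The table name, the index name idx_abc, the column types, the extra payload column, and the literal values are all assumptions made up for this example. Running EXPLAIN on each query in MySQL should show which of them can use the index.

```sql
-- Hypothetical table; names and types are illustrative only.
CREATE TABLE t (
    id       INT UNSIGNED NOT NULL AUTO_INCREMENT,
    column_A INT NOT NULL,
    column_B INT NOT NULL,
    column_C INT NOT NULL,
    payload  VARCHAR(64) NOT NULL,
    PRIMARY KEY (id),
    KEY idx_abc (column_A, column_B, column_C)
) ENGINE=InnoDB;

-- Can use all three key parts of idx_abc.
EXPLAIN SELECT * FROM t WHERE column_A = 1 AND column_B = 2 AND column_C = 3;

-- Can use the leftmost prefix (column_A, column_B).
EXPLAIN SELECT * FROM t WHERE column_A = 1 AND column_B = 2;

-- Can use the leftmost prefix (column_A).
EXPLAIN SELECT * FROM t WHERE column_A = 1;

-- Only the column_A key part can narrow the lookup;
-- the column_C condition is checked afterwards.
EXPLAIN SELECT * FROM t WHERE column_A = 1 AND column_C = 3;

-- No leftmost prefix is present, so idx_abc typically
-- cannot be used for an index lookup here.
EXPLAIN SELECT * FROM t WHERE column_B = 2 AND column_C = 3;
```

In the EXPLAIN output, the key column shows the index the optimizer chose (NULL if none), and key_len hints at how many key parts it actually used.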
In your original example, a composite index on the two dimensions will mostly benefit queries that filter on both dimensions, or on the leftmost dimension by itself, but not queries that filter on the rightmost dimension by itself. If you're always querying both dimensions, a composite index is the way to go, and it most probably doesn't matter which column comes first.
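If you want to try the composite index on the homes table, one way to make the change is sketched below. The index name geolat_geolng is the one proposed in the question; the BETWEEN bounds are placeholder values, not real data. Comparing the EXPLAIN output before and after the change is probably the most reliable way to see whether the optimizer picks the new index for your data.

```sql
-- Replace the two single-column indexes with one composite index.
ALTER TABLE homes
    DROP INDEX geolat,
    DROP INDEX geolng,
    ADD INDEX geolat_geolng (geolat, geolng);

-- Placeholder bounds; substitute the real bounding box.
EXPLAIN SELECT * FROM homes
WHERE geolat BETWEEN 30.0 AND 31.0
  AND geolng BETWEEN -98.0 AND -97.0;
```

Because both conditions are ranges, MySQL's B-tree range access generally narrows the scan on the leading column (geolat) only and then applies the geolng condition to the entries in that range. Check the key and rows columns of the EXPLAIN output rather than assuming a speedup.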