我有一个不同对象之间的连接表,我基本上是在尝试使用自连接进行图形遍历。
我的表定义为:
CREATE TABLE `connections` (
`id` bigint(20) NOT NULL AUTO_INCREMENT,
`position` int(11) NOT NULL,
`dId` bigint(20) NOT NULL,
`sourceId` bigint(20) NOT NULL,
`targetId` bigint(20) NOT NULL,
`type` bigint(20) NOT NULL,
`weight` float NOT NULL DEFAULT '1',
`refId` bigint(20) NOT NULL,
`ts` bigint(20) NOT NULL,
PRIMARY KEY (`id`),
KEY `sourcetype` (`type`,`sourceId`,`targetId`),
KEY `targettype` (`type`,`targetId`,`sourceId`),
KEY `complete` (`dId`,`sourceId`,`targetId`,`type`)
) ENGINE=InnoDB DEFAULT CHARSET=utf8 COLLATE=utf8_unicode_ci;
该表包含大约 3M 个条目(类型 1 约 1K,类型 2 约 1M,类型 3 约 2M)。
超过 2 或 3 跳的查询实际上非常快(当然需要一段时间才能收到所有结果),但是获取 3 跳的查询计数非常慢(> 30 秒)。
这是查询(返回 2M):
SELECT
count(*)
FROM
`connections` AS `t0`
JOIN
`connections` AS `t1` ON `t1`.`targetid`=`t0`.`sourceid`
JOIN
`connections` AS `t2` ON `t2`.`targetid`=`t1`.`sourceid`
WHERE
`t2`.dId = 1
AND
`t2`.`sourceid` = 1
AND
`t2`.`type` = 1
AND
`t1`.`type` = 2
AND
`t0`.`type` = 3;
这是相应的EXPLAIN:
id select_type table type possible_keys key key_len ref rows Extra
1 SIMPLE t2 ref targettype,complete,sourcetype complete 16 const,const 100 Using where; Using index
1 SIMPLE t1 ref targettype,sourcetype targettype 8 const 2964 Using where; Using index
1 SIMPLE t0 ref targettype,sourcetype sourcetype 16 const,travtest.t1.targetId 2964 Using index
编辑:这是添加和索引
type
后的解释:id select_type table type possible_keys key key_len ref rows Extra
1 SIMPLE t2 ref type,complete,sourcetype,targettype complete 16 const,const 100 Using where; Using index
1 SIMPLE t1 ref type,sourcetype,targettype sourcetype 16 const,travtest.t2.targetId 2 Using index
1 SIMPLE t0 ref type,sourcetype,targettype sourcetype 16 const,travtest.t1.targetId 2 Using index
有没有办法改善这种情况?
第二次编辑:
EXPLAN EXTENDED:
+----+-------------+-------+------+-------------------------------------+------------+---------+----------------------------+------+----------+--------------------------+
| id | select_type | table | type | possible_keys | key | key_len | ref | rows | filtered | Extra |
+----+-------------+-------+------+-------------------------------------+------------+---------+----------------------------+------+----------+--------------------------+
| 1 | SIMPLE | t2 | ref | type,complete,sourcetype,targettype | complete | 16 | const,const | 100 | 100.00 | Using where; Using index |
| 1 | SIMPLE | t1 | ref | type,sourcetype,targettype | sourcetype | 16 | const,travtest.t2.targetId | 1 | 100.00 | Using index |
| 1 | SIMPLE | t0 | ref | type,sourcetype,targettype | sourcetype | 16 | const,travtest.t1.targetId | 1 | 100.00 | Using index |
+----+-------------+-------+------+-------------------------------------+------------+---------+----------------------------+------+----------+--------------------------+
SHOW WARNINGS;
+-------+------+--------------------------------------------------------------------------------------------+
| Level | Code | Message |
+-------+------+--------------------------------------------------------------------------------------------+
| Note | 1003 | /* select#1 */ select count(0) AS `count(*)` from `travtest`.`connections` `t0` |
| | | join `travtest`.`connections` `t1` join `travtest`.`connections` `t2` |
| | | where ((`travtest`.`t0`.`sourceId` = `travtest`.`t1`.`targetId`) and |
| | | (`travtest`.`t1`.`sourceId` = `travtest`.`t2`.`targetId`) and (`travtest`.`t0`.`type` = 3) |
| | | and (`travtest`.`t1`.`type` = 2) and (`travtest`.`t2`.`type` = 1) and |
| | | (`travtest`.`t2`.`sourceId` = 1) and (`travtest`.`t2`.`dId` = 1)) |
+-------+------+--------------------------------------------------------------------------------------------+
最佳答案
为 sourceid
、 targetid
和 type
列创建索引,然后尝试使用以下查询:
SELECT
count(*)
FROM
`connections` AS `t0`
JOIN
`connections` AS `t1` ON `t1`.`targetid`=`t0`.`sourceid` and `t1`.`type` = 2
JOIN
`connections` AS `t2` ON `t2`.`targetid`=`t1`.`sourceid` and `t2`.dId = 1 AND `t2`.`sourceid` = 1 AND `t2`.`type` = 1
WHERE
`t0`.`type` = 3;
- - - -更新 - - -
我认为这些索引是正确的,使用这些大表可以达到最佳优化。我认为您无法通过其他优化(例如表分区/分片)来改进此查询。
如果这些数据不经常更改,或者我看到的唯一方法是垂直缩放,您可以实现某种缓存
关于mysql - 加快 MYSQL 中的自联接,我们在Stack Overflow上找到一个类似的问题:https://stackoverflow.com/questions/19910020/