在数据库中,我有一个包含订单项的表。该表包含大约3600万条记录。

像这样运行查询大约需要3分钟:

SELECT COUNT(DISTINCT DATE(created_on), product_id) FROM order_items;


像这样运行查询大约需要13秒钟:

SELECT COUNT(1) FROM order_items;


某事告诉我,3600万条记录并不多,而且两个查询的运行速度都很慢。

在这里开始研究性能问题的清单将是什么?

我们正在使用MySQL(实际上是它的Clustrix版本MySQL 5.0.45-clustrix-6.0.1)。

编辑。添加更多信息:

/* SHOW CREATE TABLE order_items; */
CREATE TABLE `order_items` (
  `id` int(10) unsigned not null AUTO_INCREMENT,
  `state` enum('pending','sold_out','approved','declined','cancelled','processing','completed','expired') CHARACTER SET utf8 not null default 'pending',
  `order_id` int(10) unsigned not null,
  `product_id` int(10) unsigned not null,
  `quantity` smallint(5) unsigned not null,
  `price` decimal(10,2) unsigned not null,
  `total` decimal(10,2) unsigned not null,
  `created_on` datetime not null,
  `updated_on` datetime not null,
  `employee_id` int(11),
  `customer_id` int(11) unsigned not null,
  PRIMARY KEY (`id`) /*$ DISTRIBUTE=1 */,
  KEY `updated_on` (`updated_on`) /*$ DISTRIBUTE=1 */,
  KEY `state` (`state`,`quantity`) /*$ DISTRIBUTE=3 */,
  KEY `product_id` (`product_id`,`state`) /*$ DISTRIBUTE=2 */,
  KEY `product` (`product_id`) /*$ DISTRIBUTE=1 */,
  KEY `order_items_quantity` (`quantity`) /*$ DISTRIBUTE=2 */,
  KEY `order_id` (`order_id`,`state`,`created_on`) /*$ DISTRIBUTE=3 */,
  KEY `order` (`order_id`) /*$ DISTRIBUTE=1 */,
  KEY `index_order_items_on_employee_id` (`employee_id`) /*$ DISTRIBUTE=2 */,
  KEY `customer_id` (`customer_id`) /*$ DISTRIBUTE=2 */,
  KEY `created_at` (`created_on`) /*$ DISTRIBUTE=1 */,
) AUTO_INCREMENT=36943352 CHARACTER SET utf8 ENGINE=InnoDB /*$ REPLICAS=2 SLICES=12 */


和:

/* SHOW VARIABLES LIKE '%buffer%'; */
+----------------------------------------+-------+
| Variable_name                          | Value |
+----------------------------------------+-------+
| backup_compression_buffer_size_bytes   | 8192  |
| backup_read_buffer_size_bytes          | 8192  |
| backup_write_buffer_size_bytes         | 8192  |
| mysql_master_trx_buffer_kb             | 256   |
| mysql_slave_session_buffer_size_events | 100   |
| net_buffer_length                      | 16384 |
| replication_master_buffer_kb           | 65536 |
+----------------------------------------+-------+


编辑2。这是两个查询的EXPLAIN语句:

mysql> EXPLAIN SELECT COUNT(1) FROM order_items;
+----------------------------------------------------------+-------------+-------------+
| Operation                                                | Est. Cost   | Est. Rows   |
+----------------------------------------------------------+-------------+-------------+
| row_count "expr1"                                        | 29740566.81 |        1.00 |
|   stream_combine                                         | 26444732.70 | 32958341.10 |
|     compute expr0 := param(0)                            |  1929074.80 |  2746528.43 |
|       filter isnotnull(param(0))                         |  1915342.16 |  2746528.43 |
|         index_scan 1 := order_items.order_items_quantity |  1854308.19 |  3051698.25 |
+----------------------------------------------------------+-------------+-------------+
5 rows in set (0.13 sec)


和:

mysql> EXPLAIN SELECT COUNT(DISTINCT DATE(created_on), product_id) FROM order_items;
+----------------------------------------------------------------------------------+-------------+------------+
| Operation                                                                        | Est. Cost   | Est. Rows  |
+----------------------------------------------------------------------------------+-------------+------------+
| hash_aggregate_combine expr1 := count(DISTINCT (0 . "expr0"),(1 . "product_id")) | 10115923.36 | 4577547.38 |
|   hash_aggregate_partial GROUPBY((0 . "expr0"), (1 . "product_id"))              |  3707357.04 | 4577547.38 |
|     compute expr0 := cast(1.created_on, date)                                    |  2166388.20 | 3051698.25 |
|       index_scan 1 := order_items.__idx_order_items__PRIMARY                     |  2151129.71 | 3051698.25 |
+----------------------------------------------------------------------------------+-------------+------------+
4 rows in set (0.24 sec)

最佳答案

第一个查询必须遍历整个数据库,检查表中的每一行。在created_on和product_id上的索引可能会大大加快索引的速度。如果您不了解索引,http://use-the-index-luke.com是一个很好的起点。

在我看来,第二个查询应该是即时的,因为它只需要检查表元数据,而不需要检查任何行。

关于mysql - 数据库性能 list ,我们在Stack Overflow上找到一个类似的问题:https://stackoverflow.com/questions/31329382/

10-13 08:49
查看更多