问题描述
我写了一个自定义的服务台来工作,直到最近,它一直运转良好.一个查询的确确实放慢了速度.现在大约需要14秒!以下是相关表格:
I wrote a custom help desk for work and it's been running great... until recently. One query has really slowed down. It takes about 14 seconds now! Here are the relevant tables:
CREATE TABLE `tickets` (
`id` int(11) unsigned NOT NULL DEFAULT '0',
`date_submitted` datetime DEFAULT NULL,
`date_closed` datetime DEFAULT NULL,
`first_name` varchar(50) DEFAULT NULL,
`last_name` varchar(50) DEFAULT NULL,
`email` varchar(50) DEFAULT NULL,
`description` text,
`agent_id` smallint(5) unsigned NOT NULL DEFAULT '1',
`status` smallint(5) unsigned NOT NULL DEFAULT '1',
`priority` tinyint(4) NOT NULL DEFAULT '0',
PRIMARY KEY (`id`),
KEY `date_closed` (`date_closed`)
) ENGINE=InnoDB DEFAULT CHARSET=utf8;
CREATE TABLE `solutions` (
`id` int(10) unsigned NOT NULL,
`ticket_id` mediumint(8) unsigned DEFAULT NULL,
`date` datetime DEFAULT NULL,
`hours_spent` float DEFAULT NULL,
`agent_id` smallint(5) unsigned DEFAULT NULL,
`body` text,
PRIMARY KEY (`id`),
KEY `ticket_id` (`ticket_id`),
KEY `date` (`date`),
KEY `hours_spent` (`hours_spent`)
) ENGINE=InnoDB DEFAULT CHARSET=utf8;
当用户提交票证时,它将进入票证"表.然后,当业务代表解决问题时,他们会记录他们采取的措施.每个条目都进入解决方案"表.换句话说,票证有很多解决方案.
When a user submits a ticket, it goes into the "tickets" table. Then, as the agents work through the problem, they record the actions they took. Each entry goes into the "solutions" table. In other words, tickets have many solutions.
已降低速度的查询的目标是从票证"表中提取所有字段,并从解决方案"表中提取最新的条目.这是我一直在使用的查询:
The goal of the query that has slowed down is to pull all the fields from the "tickets" table and also the latest entry from the "solutions" table. This is the query I've been using:
SELECT tickets.*,
(SELECT CONCAT_WS(" * ", DATE_FORMAT(solutions.date, "%c/%e/%y"), solutions.hours_spent, CONCAT_WS(": ", solutions.agent_id, solutions.body))
FROM solutions
WHERE solutions.ticket_id = tickets.id
ORDER BY solutions.date DESC, solutions.id DESC
LIMIT 1
) AS latest_solution_entry
FROM tickets
WHERE tickets.date_closed IS NULL
OR tickets.date_closed >= '2012-06-20 00:00:00'
ORDER BY tickets.id DESC
以下是"latest_solution_entry"字段的示例:
Here is an example of what the "latest_solution_entry" field looks like:
6/20/12 * 1337 * 1: I restarted the computer and that fixed the problem. Yes, I took an hour to do this.
在PHP中,我拆分了"latest_solution_entry"字段并正确设置了格式.
In PHP, I split up the "latest_solution_entry" field and format it correctly.
当我注意到运行查询的页面放慢了 way 的速度时,我在没有子查询的情况下运行了该查询,并且运行速度非常快.然后,我在原始查询上运行了EXPLAIN
并得到了:
When I noticed that the page that runs the query had slowed way down, I ran the query without the subquery and it was super fast. I then ran an EXPLAIN
on the original query and got this:
+----+--------------------+-----------+-------+---------------+-----------+---------+---------------------+-------+-----------------------------+
| id | select_type | table | type | possible_keys | key | key_len | ref | rows | Extra |
+----+--------------------+-----------+-------+---------------+-----------+---------+---------------------+-------+-----------------------------+
| 1 | PRIMARY | tickets | index | date_closed | PRIMARY | 4 | NULL | 35804 | Using where |
| 2 | DEPENDENT SUBQUERY | solutions | ref | ticket_id | ticket_id | 4 | helpdesk.tickets.id | 1 | Using where; Using filesort |
+----+--------------------+-----------+-------+---------------+-----------+---------+---------------------+-------+-----------------------------+
因此,我正在寻找一种提高查询效率的方法,但仍可以实现相同的目标.有什么想法吗?
So I'm looking for a way to make my query more efficient, yet still achieve the same goal. Any ideas?
推荐答案
让我总结一下我的理解:您想选择每张票证及其最后的解决方案.
Let me sum up what I understood: you'd like to select each ticket and its last solution.
我喜欢对以下类型的问题使用以下模式,因为它避免了子查询模式,因此在需要性能的地方非常好.缺点是要理解它有点棘手:
I like using the following pattern for this kind of question as it avoids the subquery pattern and is therefore rather good where performance is needed. The drawback is that it is a bit tricky to understand:
SELECT
t.*,
s1.*
FROM tickets t
INNER JOIN solutions s1 ON t.id = s1.ticket_id
LEFT JOIN solutions s2 ON s1.ticket_id = s2.ticket_id AND s2.id > s1.id
WHERE s2.id IS NULL;
为了更好地理解,我只写了模式的心脏.
I wrote only the heart of the pattern for a better understanding.
关键是:
-
solutions
表的左联接及其自身具有s1.ticket_id = s2.ticket_id
条件:它模拟GROUP BY ticket_id
.
the LEFT JOIN of the
solutions
table with itself with thes1.ticket_id = s2.ticket_id
condition: it emulates theGROUP BY ticket_id
.
条件s2.id > s1.id
:它是我只想要最后一个解决方案"的SQL,它模拟了MAX()
.我假设在您的模型中,the last
表示with the greatest id
,但是您可以在此处使用日期条件.请注意,s2.id < s1.id
将为您提供第一个解决方案.
the condition s2.id > s1.id
: it is the SQL for "I only want the last solution", it emulates the MAX()
. I assumed that in your model, the last
means with the greatest id
but you could use here a condition on the date. Note that s2.id < s1.id
would give you the first solution.
WHERE子句s2.id IS NULL
:最奇怪但绝对必要的……仅保留您想要的记录.
the WHERE clause s2.id IS NULL
: the weirdest one but absolutely necessary... keeps only the records you want.
尝试一下,让我知道:)
Have a try and let me know :)
我刚刚意识到第二点假设过分简化了问题.这使它变得更加有趣:p我正在尝试查看此模式如何与您的date, id
排序一起使用.
Edit 1: I just realised that the second point assumption was oversimplifying the problem. That makes it even more interesting :p I'm trying to see how this pattern may work with your date, id
ordering.
好的,只要稍作改动,效果就很好. LEFT JOIN上的条件变为:
Edit 2: Ok, it works great with a little twist. The condition on the LEFT JOIN becomes:
LEFT JOIN solutions s2 ON s1.ticket_id = s2.ticket_id
AND (s2.date > s1.date OR (s2.date = s1.date AND s2.id > s1.id))
这篇关于MySQL:我可以做一个左联接,只从联接表中拉一行吗?的文章就介绍到这了,希望我们推荐的答案对大家有所帮助,也希望大家多多支持!