Problem description
Here's a puzzle for you:
I'm keeping stats of cluster computing stuff in a MySQL table named 'jobs'. Each job row has a host the job executed on (not unique), a job execution time in seconds, and a unique integer as the PK so I can order the completed jobs simply by ordering the PK.
As of right now, using average and group by, I can find the average execution time in seconds for each host over all of the jobs completed. Instead of averaging all the execution times per host, I want the average time of the last five jobs per host.
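A minimal sketch of the all-jobs average described here. The question does not list the column names, so `host`, `execution_time`, and `id` are assumptions borrowed from the answer below.

```sql
-- Average execution time per host over ALL completed jobs.
-- Column names are assumed; the question does not spell them out.
SELECT
    host,
    AVG(execution_time) AS avg_seconds
FROM jobs
GROUP BY host;
```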
There's all sorts of examples for operations and group by, and lots of examples for operations with limit, but is there any way of combining the two in a fairly straightforward MySQL query?
In case that isn't clear: I want the average of the last five execution times for host 1, the average of the last five execution times for host 2, and so on.
Recommended answer
My initial reaction was to use LIMIT to restrict the average to 5 results, which led me to suggest:
select a.host, avg(a.execution_time) from (select id, execution_time, host from jobs order by id desc limit 5) a group by a.host;
But it is clear that this limits the average to the most recent 5 jobs, and not the most recent 5 jobs per host.
It seems difficult to use LIMIT to restrict the average without some kind of stored procedure. This led me to consider assigning each job a per-host completion order, or position, using a MySQL user variable.
This is untested, but the theory it illustrates should be a good starting point:
First, we should assign each job a position within its host, counting from the most recent job:
select
    host,
    execution_time,
    -- restart the counter at 1 whenever the host changes
    @current_pos := if(@current_host = host, @current_pos, 0) + 1 as position,
    -- remember this row's host for the next comparison
    @current_host := host
from
    (select @current_host := null, @current_pos := 0) set_pos,
    jobs
order by
    host,
    id desc;
After establishing the position, just select the aggregate function, restricting results to the top 5 positions:
select
    jt.host,
    avg(jt.execution_time)
from
    (
        select
            host,
            execution_time,
            @current_pos := if(@current_host = host, @current_pos, 0) + 1 as position,
            @current_host := host
        from
            (select @current_host := null, @current_pos := 0) set_pos,
            jobs
        order by
            host,
            id desc
    ) jt
where
    jt.position <= 5
group by
    jt.host;
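The same per-host position idea can also be sketched without user variables. This is an alternative technique, not part of the original answer. It may be worth trying if the variable-based query gives unexpected results, since MySQL does not guarantee the evaluation order of variable assignments inside a SELECT. The first query assumes MySQL 8.0 or later (window functions). The second is a fallback for older versions. Both assume the same `id`, `host`, and `execution_time` columns as above.

```sql
-- Variant 1 (assumes MySQL 8.0+): number each host's jobs from newest to
-- oldest with ROW_NUMBER(), then average only positions 1 through 5.
SELECT
    ranked.host,
    AVG(ranked.execution_time) AS avg_last_five
FROM (
    SELECT
        host,
        execution_time,
        ROW_NUMBER() OVER (PARTITION BY host ORDER BY id DESC) AS position
    FROM jobs
) AS ranked
WHERE ranked.position <= 5
GROUP BY ranked.host;

-- Variant 2 (no window functions): keep a job only if fewer than five
-- jobs on the same host have a larger id, i.e. it is among the newest five.
SELECT
    j.host,
    AVG(j.execution_time) AS avg_last_five
FROM jobs AS j
WHERE (
    SELECT COUNT(*)
    FROM jobs AS newer
    WHERE newer.host = j.host
      AND newer.id > j.id
) < 5
GROUP BY j.host;
```

Hosts with fewer than five jobs are averaged over however many jobs they have. The correlated subquery in the second variant runs once per row, so an index on (host, id) would likely help on a large table.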
Please let me know if this works for you, or if there are more aspects I have not considered. This is an intriguing problem.