问题描述
MODE是数据中出现MOST次的值,可以是ONE MODE或MANY MODES
MODE is the value that occurs the MOST times in the data, there can be ONE MODE or MANY MODES
这是两个表中的一些值( sqlFiddle )
here's some values in two tables (sqlFiddle)
create table t100(id int auto_increment primary key, value int);
create table t200(id int auto_increment primary key, value int);
insert into t100(value) values (1),
(2),(2),(2),
(3),(3),
(4);
insert into t200(value) values (1),
(2),(2),(2),
(3),(3),
(4),(4),(4);
现在,要获取以逗号分隔列表形式返回的MODE,我对表t100
right now, to get the MODE(S) returned as comma separated list, I run the below query for table t100
SELECT GROUP_CONCAT(value) as modes,occurs
FROM
(SELECT value,occurs FROM
(SELECT value,count(*) as occurs
FROM
T100
GROUP BY value)T1,
(SELECT max(occurs) as maxoccurs FROM
(SELECT value,count(*) as occurs
FROM
T100
GROUP BY value)T2
)T3
WHERE T1.occurs = T3.maxoccurs)T4
GROUP BY occurs;
和下面对表t200
的查询(相同的查询,只是更改了表名)在此示例中,我有2个表,因为它表明它适用于具有1个MODE且具有多个MODES的情况.
and the below query for table t200
(same query just with table name changed) I have 2 tables in this example because to show that it works for cases where there's 1 MODE and where there are multiple MODES.
SELECT GROUP_CONCAT(value) as modes,occurs
FROM
(SELECT value,occurs FROM
(SELECT value,count(*) as occurs
FROM
T200
GROUP BY value)T1,
(SELECT max(occurs) as maxoccurs FROM
(SELECT value,count(*) as occurs
FROM
T200
GROUP BY value)T2
)T3
WHERE T1.occurs = T3.maxoccurs)T4
GROUP BY occurs;
我的问题是有没有更简单的方法?"
My question is "Is there a simpler way?"
我当时想使用HAVING count(*) = max(count(*))
或类似的方法来摆脱多余的联接,但无法获得HAVING
来返回我想要的结果.
I was thinking like using HAVING count(*) = max(count(*))
or something similar to get rid of the extra join but couldn't get HAVING
to return the result i wanted.
已更新:按照@zneak的建议,我可以像下面这样简化T3
:
UPDATED:as suggested by @zneak, I can simplify T3
like below:
SELECT GROUP_CONCAT(value) as modes,occurs
FROM
(SELECT value,occurs FROM
(SELECT value,count(*) as occurs
FROM
T200
GROUP BY value)T1,
(SELECT count(*) as maxoccurs
FROM
T200
GROUP BY value
ORDER BY count(*) DESC
LIMIT 1
)T3
WHERE T1.occurs = T3.maxoccurs)T4
GROUP BY occurs;
现在是否有办法完全搭乘T3?我尝试了此操作,但由于某种原因它不返回任何行
Now is there a way to get ride of T3 altogether?I tried this but it returns no rows for some reason
SELECT value,occurs FROM
(SELECT value,count(*) as occurs
FROM t200
GROUP BY `value`)T1
HAVING occurs=max(occurs)
基本上,我想知道是否有一种方法可以只指定一次t100
或t200
.
basically I am wondering if there's a way to do it such that I only need to specify t100
or t200
once.
已更新:我发现通过添加一个变量来设置我自己的maxoccurs只能指定一次t100
或t200
的方法,如下所示:
UPDATED: i found a way to specify t100
or t200
only once by adding a variable to set my own maxoccurs like below
SELECT GROUP_CONCAT(CASE WHEN occurs=@maxoccurs THEN value ELSE NULL END) as modes
FROM
(SELECT value,occurs,@maxoccurs:=GREATEST(@maxoccurs,occurs) as maxoccurs
FROM (SELECT value,count(*) as occurs
FROM t200
GROUP BY `value`)T1,(SELECT @maxoccurs:=0)mo
)T2
推荐答案
与上一个查询非常接近.以下内容可找到一个模式:
You are very close with the last query. The following finds one mode:
SELECT value, occurs
FROM (SELECT value,count(*) as occurs
FROM t200
GROUP BY `value`
LIMIT 1
) T1
不过,我认为您的问题是关于多种模式的:
I think your question was about multiple modes, though:
SELECT value, occurs
FROM (SELECT value, count(*) as occurs
FROM t200
GROUP BY `value`
) T1
WHERE occurs = (select max(occurs)
from (select `value`, count(*) as occurs
from t200
group by `value`
) t
);
在几乎所有其他数据库中,这要容易得多. MySQL不支持with
也不支持窗口/分析功能.
This is much easier in almost any other database. MySQL supports neither with
nor window/analytic functions.
您的查询(如下所示)不执行您认为的操作:
Your query (shown below) does not do what you think it is doing:
SELECT value, occurs
FROM (SELECT value, count(*) as occurs
FROM t200
GROUP BY `value`
) T1
HAVING occurs = max(occurs) ;
最后的having
子句引用变量occurs
,但确实使用了max(occurs)
.由于使用了max(occurs)
,因此这是一个聚合查询,该查询返回一行,汇总了子查询中的所有行.
The final having
clause refers to the variable occurs
but does use max(occurs)
. Because of the use of max(occurs)
this is an aggregation query that returns one row, summarizing all rows from the subquery.
变量occurs
未用于分组.那么,MySQL使用什么价值呢?它使用子查询中任一行的任意值.此任意值可能匹配,也可能不匹配.但是,该值仅来自一行.没有迭代.
The variable occurs
is not using for grouping. So, what value does MySQL use? It uses an arbitrary value from one of the rows in the subquery. This arbitrary value might match, or it might not. But, the value only comes from one row. There is no iteration over it.
这篇关于有没有一种简单的方法可以在MySQL中查找某些值的MODE(S)的文章就介绍到这了,希望我们推荐的答案对大家有所帮助,也希望大家多多支持!