Problem description
I have a table emp with the following structure and data:
name dept salary
----- ----- -----
Jack a 2
Jill a 1
Tom b 2
Fred b 1
When I execute the following SQL:
SELECT * FROM emp GROUP BY dept
I get the following result:
name dept salary
----- ----- -----
Jill a 1
Fred b 1
On what basis did the server decide to return Jill and Fred and exclude Jack and Tom?
I am running this query in MySQL.
Note 1: I know the query doesn't make sense on its own. I am trying to debug a problem with a GROUP BY scenario, and for that I want to understand the default behavior.
Note 2: I am used to writing a SELECT clause that, apart from the aggregate fields, lists the same columns as the GROUP BY clause. When I came across the behavior described above, I started wondering whether I can rely on it for scenarios such as selecting the rows from the emp table where the salary is the lowest or highest in the dept. For example, a SQL statement like this runs on MySQL:
SELECT A.*, MIN(A.salary) AS min_salary FROM emp AS A GROUP BY A.dept
I didn't find any material describing why such SQL works or, more importantly, whether I can rely on this behavior consistently. If it is reliable, then I can avoid queries like:
SELECT A.* FROM emp AS A WHERE A.salary = (
SELECT MAX(B.salary) FROM emp B WHERE B.dept = A.dept)
Recommended answer
Read the MySQL documentation on this particular point.
In a nutshell, MySQL allows some columns to be omitted from the GROUP BY clause for performance purposes. However, this works only if the omitted columns all have the same value within each group; otherwise, the values returned by the query are indeed indeterminate, as others in this post correctly guessed. Note that adding an ORDER BY clause does not reintroduce any form of deterministic behavior.
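Since the non-grouped columns cannot be relied on, a deterministic way to get the lowest-paid rows per dept is to compute the minimum per group in a subquery and join it back to the table. The following is only a minimal sketch: it uses Python's built-in sqlite3 module as a stand-in so it runs without a MySQL server, and the helper names (make_emp_table, lowest_paid_per_dept) are made up for illustration. The SQL itself is standard and is expected to behave the same way on MySQL.

```python
import sqlite3


def make_emp_table():
    """Create an in-memory emp table holding the rows from the question."""
    conn = sqlite3.connect(":memory:")
    conn.execute("CREATE TABLE emp (name TEXT, dept TEXT, salary INTEGER)")
    conn.executemany(
        "INSERT INTO emp VALUES (?, ?, ?)",
        [("Jack", "a", 2), ("Jill", "a", 1), ("Tom", "b", 2), ("Fred", "b", 1)],
    )
    return conn


def lowest_paid_per_dept(conn):
    """Return every (name, dept, salary) row whose salary is the dept minimum."""
    query = """
        SELECT e.name, e.dept, e.salary
        FROM emp AS e
        JOIN (SELECT dept, MIN(salary) AS min_salary
              FROM emp
              GROUP BY dept) AS m
          ON m.dept = e.dept AND m.min_salary = e.salary
        ORDER BY e.dept, e.name
    """
    return conn.execute(query).fetchall()


rows = lowest_paid_per_dept(make_emp_table())
print(rows)  # [('Jill', 'a', 1), ('Fred', 'b', 1)]
```

Unlike the bare GROUP BY query, this form returns all tied rows when several employees share the minimum salary in a dept, and the result does not depend on which row the server happens to pick.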
Although not at the core of the issue, this example shows why using * rather than an explicit list of the desired columns is often a bad idea.
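To illustrate the point about listing columns explicitly, here is a small sketch of a grouped query in which every selected column is either the grouping column or an aggregate, so each output value is well defined. As an assumption for runnability, it again uses Python's sqlite3 module in place of MySQL, and the function name dept_salary_summary and the aliases min_salary, max_salary and headcount are hypothetical. For context, MySQL 5.7.5 and later enable the ONLY_FULL_GROUP_BY SQL mode by default, under which a query like SELECT * FROM emp GROUP BY dept is rejected instead of returning arbitrary rows.

```python
import sqlite3


def dept_salary_summary(conn):
    """Return (dept, min_salary, max_salary, headcount) for each dept."""
    query = """
        SELECT dept,
               MIN(salary) AS min_salary,
               MAX(salary) AS max_salary,
               COUNT(*)    AS headcount
        FROM emp
        GROUP BY dept
        ORDER BY dept
    """
    return conn.execute(query).fetchall()


# Build the sample table from the question in an in-memory database.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE emp (name TEXT, dept TEXT, salary INTEGER)")
conn.executemany(
    "INSERT INTO emp VALUES (?, ?, ?)",
    [("Jack", "a", 2), ("Jill", "a", 1), ("Tom", "b", 2), ("Fred", "b", 1)],
)

summary = dept_salary_summary(conn)
print(summary)  # [('a', 1, 2, 2), ('b', 1, 2, 2)]
```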
Excerpt from the MySQL 5.0 documentation:
When using this feature, all rows in each group should have the same values
for the columns that are omitted from the GROUP BY part. The server is free
to return any value from the group, so the results are indeterminate unless
all values are the same.