目录
一、CUME_DIST 取累积分布
MySQL CUME_DIST() 函数返回当前行的累积分布,即从第一行到与当前行值相同的最后一行的行数在分区内的总行数中的占比。
语法:
CUME_DIST()
OVER (
[PARTITION BY partition_column_list]
[ORDER BY order_column_list]
)
参数说明:
partition_column_list
:参与分区的列的列表。partition_column_list
:参与排序的列的列表。
返回值:
- 一个大于 0 并且小于等于 1 的值,它是当前行的累积分布。
- 计算公式为:(当前行之前的行数 + 与当前行值相同的行数) / 分区内的总行数。
示例:
mysql> select * from grade;
+----+------+-------+---------+-------+
| id | name | class | subject | grade |
+----+------+-------+---------+-------+
| 1 | Tim | A | Math | 9 |
| 2 | Tom | A | Math | 7 |
| 3 | Jim | A | Math | 8 |
| 4 | Tim | A | English | 7 |
| 5 | Tom | A | English | 8 |
| 6 | Jim | A | English | 7 |
| 7 | Lucy | B | Math | 8 |
| 8 | Jody | B | Math | 6 |
| 9 | Susy | B | Math | 9 |
| 10 | Lucy | B | English | 6 |
| 11 | Jody | B | English | 7 |
| 12 | Susy | B | English | 8 |
+----+------+-------+---------+-------+
mysql> SELECT *,
CUME_DIST() OVER (
PARTITION BY subject
ORDER BY grade
) "cume_dist"
FROM grade;
+----+------+-------+---------+-------+---------------------+
| id | name | class | subject | grade | cume_dist |
+----+------+-------+---------+-------+---------------------+
| 10 | Lucy | B | English | 6 | 0.16666666666666666 |
| 4 | Tim | A | English | 7 | 0.6666666666666666 |
| 6 | Jim | A | English | 7 | 0.6666666666666666 |
| 11 | Jody | B | English | 7 | 0.6666666666666666 |
| 5 | Tom | A | English | 8 | 1 |
| 12 | Susy | B | English | 8 | 1 |
| 8 | Jody | B | Math | 6 | 0.16666666666666666 |
| 2 | Tom | A | Math | 7 | 0.3333333333333333 |
| 3 | Jim | A | Math | 8 | 0.6666666666666666 |
| 7 | Lucy | B | Math | 8 | 0.6666666666666666 |
| 1 | Tim | A | Math | 9 | 1 |
| 9 | Susy | B | Math | 9 | 1 |
+----+------+-------+---------+-------+---------------------+
注意,在每个科目内,成绩相同的行具有相同的累积分布。
二、DENSE_RANK 分区内的排名
MySQL DENSE_RANK() 函数返回当前行所在的分区内的排名,从 1 开始,但没有间隔。
语法:
DENSE_RANK()
OVER (
[PARTITION BY partition_column_list]
[ORDER BY order_column_list]
)
参数说明:
- partition_column_list:参与分区的列的列表。
- order_column_list:参与排序的列的列表。
返回值:
- 当前行所在的分区内的排名,从 1 开始,但没有间隔。
示例:
mysql> select * from grade;
+----+------+-------+---------+-------+
| id | name | class | subject | grade |
+----+------+-------+---------+-------+
| 1 | Tim | A | Math | 9 |
| 2 | Tom | A | Math | 7 |
| 3 | Jim | A | Math | 8 |
| 4 | Tim | A | English | 7 |
| 5 | Tom | A | English | 8 |
| 6 | Jim | A | English | 7 |
| 7 | Lucy | B | Math | 8 |
| 8 | Jody | B | Math | 6 |
| 9 | Susy | B | Math | 9 |
| 10 | Lucy | B | English | 6 |
| 11 | Jody | B | English | 7 |
| 12 | Susy | B | English | 8 |
+----+------+-------+---------+-------+
- 按照科目查看每个学生的成绩排名
mysql> SELECT *,
DENSE_RANK() OVER (
PARTITION BY subject
ORDER BY grade DESC
) "dense_rank",
rank() OVER (
PARTITION BY subject
ORDER BY grade DESC
) "rank"
FROM grade;
+----+------+-------+---------+-------+------------+------+
| id | name | class | subject | grade | dense_rank | rank |
+----+------+-------+---------+-------+------------+------+
| 5 | Tom | A | English | 8 | 1 | 1 |
| 12 | Susy | B | English | 8 | 1 | 1 |
| 4 | Tim | A | English | 7 | 2 | 3 |
| 6 | Jim | A | English | 7 | 2 | 3 |
| 11 | Jody | B | English | 7 | 2 | 3 |
| 10 | Lucy | B | English | 6 | 3 | 6 |
| 1 | Tim | A | Math | 9 | 1 | 1 |
| 9 | Susy | B | Math | 9 | 1 | 1 |
| 3 | Jim | A | Math | 8 | 2 | 3 |
| 7 | Lucy | B | Math | 8 | 2 | 3 |
| 2 | Tom | A | Math | 7 | 3 | 5 |
| 8 | Jody | B | Math | 6 | 4 | 6 |
+----+------+-------+---------+-------+------------+------+
注意,在每个科目内,成绩相同的行具有相同的累积分布。
-
按照班级查看每个学生的总成绩排名,要查看在每个班级中每个学生按照总成绩从高到底的排名。
mysql> SELECT
t.*,
DENSE_RANK() OVER (
PARTITION BY class
ORDER BY t.sum_grade DESC
) "dense_rank"
FROM (
SELECT class,
name,
sum(grade) sum_grade
FROM grade
GROUP BY class, name
) t;
+-------+------+-----------+------------+
| class | name | sum_grade | dense_rank |
+-------+------+-----------+------------+
| A | Tim | 16 | 1 |
| A | Tom | 15 | 2 |
| A | Jim | 15 | 2 |
| B | Susy | 17 | 1 |
| B | Lucy | 14 | 2 |
| B | Jody | 13 | 3 |
+-------+------+-----------+------------+
- 其中子查询
mysql> SELECT class,
name,
sum(grade) sum_grade
FROM grade
GROUP BY class, name;
+-------+------+-----------+
| class | name | sum_grade |
+-------+------+-----------+
| A | Tim | 16 |
| A | Tom | 15 |
| A | Jim | 15 |
| B | Lucy | 14 |
| B | Jody | 13 |
| B | Susy | 17 |
+-------+------+-----------+
主语句在从这个子查询中按照班级进行分区,然后在每个分区内按照总成绩逆序排序,并使用 DENSE_RANK()
返回每行在其关联的分区内的排名。
三、FIRST_VALUE 评估值(第一行)
MySQL FIRST_VALUE() 函数从当前行关联的窗口框架的第一行中返回评估的值。
语法:
FIRST_VALUE(expr)
OVER (
[PARTITION BY partition_column_list]
[ORDER BY order_column_list]
)
参数说明:
expr
:必需的。它可以是一个列名或者表达式。partition_column_list
:参与分区的列的列表。order_column_list
:参与排序的列的列表。
返回值:
- 当前行关联的窗口框架的第一行中返回评估的值。
示例:
mysql> SELECT * FROM grade;
+----+------+-------+---------+-------+
| id | name | class | subject | grade |
+----+------+-------+---------+-------+
| 1 | Tim | A | Math | 9 |
| 2 | Tom | A | Math | 7 |
| 3 | Jim | A | Math | 8 |
| 4 | Tim | A | English | 7 |
| 5 | Tom | A | English | 8 |
| 6 | Jim | A | English | 7 |
| 7 | Lucy | B | Math | 8 |
| 8 | Jody | B | Math | 6 |
| 9 | Susy | B | Math | 9 |
| 10 | Lucy | B | English | 6 |
| 11 | Jody | B | English | 7 |
| 12 | Susy | B | English | 8 |
+----+------+-------+---------+-------+
- 按照科目查看每个学生的成绩排名,要查看在每个科目中每个学生按照成绩从高到低的排序号和每个科目中的最好成绩。
mysql> SELECT
*,
FIRST_VALUE(grade) OVER (
PARTITION BY subject
ORDER BY grade DESC
RANGE BETWEEN
UNBOUNDED PRECEDING AND UNBOUNDED FOLLOWING
) first_grade
FROM grade;
+----+------+-------+---------+-------+-------------+
| id | name | class | subject | grade | first_grade |
+----+------+-------+---------+-------+-------------+
| 5 | Tom | A | English | 8 | 8 |
| 12 | Susy | B | English | 8 | 8 |
| 4 | Tim | A | English | 7 | 8 |
| 6 | Jim | A | English | 7 | 8 |
| 11 | Jody | B | English | 7 | 8 |
| 10 | Lucy | B | English | 6 | 8 |
| 1 | Tim | A | Math | 9 | 9 |
| 9 | Susy | B | Math | 9 | 9 |
| 3 | Jim | A | Math | 8 | 9 |
| 7 | Lucy | B | Math | 8 | 9 |
| 2 | Tom | A | Math | 7 | 9 |
| 8 | Jody | B | Math | 6 | 9 |
+----+------+-------+---------+-------+-------------+
- 窗口函数:
FIRST_VALUE(grade) OVER (
PARTITION BY subject
ORDER BY grade DESC
RANGE BETWEEN
UNBOUNDED PRECEDING AND UNBOUNDED FOLLOWING
)
- 按照班级查看每个学生的总成绩排名,要查看在每个班级中每个学生按照总成绩从高到底的排序号和每个班级的最好成绩。
mysql> SELECT
t.*,
FIRST_VALUE(t.sum_grade) OVER (
PARTITION BY class
ORDER BY t.sum_grade DESC
RANGE BETWEEN
UNBOUNDED PRECEDING AND UNBOUNDED FOLLOWING
) first_grade
FROM (
SELECT class,
name,
sum(grade) sum_grade
FROM grade
GROUP BY class, name
) t;
+-------+------+-----------+-------------+
| class | name | sum_grade | first_grade |
+-------+------+-----------+-------------+
| A | Tim | 16 | 16 |
| A | Tom | 15 | 16 |
| A | Jim | 15 | 16 |
| B | Susy | 17 | 17 |
| B | Lucy | 14 | 17 |
| B | Jody | 13 | 17 |
+-------+------+-----------+-------------+
- 子查询使用 GROUP BY 子句和 sum() 按照班级和学生汇总出每个学生的总成绩。
mysql> SELECT class,
name,
sum(grade) sum_grade
FROM grade
GROUP BY class, name;
+-------+------+-----------+
| class | name | sum_grade |
+-------+------+-----------+
| A | Tim | 16 |
| A | Tom | 15 |
| A | Jim | 15 |
| B | Lucy | 14 |
| B | Jody | 13 |
| B | Susy | 17 |
+-------+------+-----------+
主语句在从这个子查询中按照班级进行分区,然后在每个分区内按照总成绩逆序排序,并使用 FIRST_VALUE(grade)
返回每行在其关联的分区内第一行中的成绩。
四、LAST_VALUE 评估值(最后行)
MySQL LAST_VALUE() 函数从当前行关联的窗口框架的最后一行中返回评估的值。
语法:
LAST_VALUE(expr)
OVER (
[PARTITION BY partition_column_list]
[ORDER BY order_column_list]
)
参数说明:
expr
:必需的。它可以是一个列名或者表达式。partition_column_list
:参与分区的列的列表。order_column_list
:参与排序的列的列表。
返回值:
- 从当前行关联的窗口框架的最后一行中返回评估的值。
示例:
mysql> SELECT * FROM grade;
+----+------+-------+---------+-------+
| id | name | class | subject | grade |
+----+------+-------+---------+-------+
| 1 | Tim | A | Math | 9 |
| 2 | Tom | A | Math | 7 |
| 3 | Jim | A | Math | 8 |
| 4 | Tim | A | English | 7 |
| 5 | Tom | A | English | 8 |
| 6 | Jim | A | English | 7 |
| 7 | Lucy | B | Math | 8 |
| 8 | Jody | B | Math | 6 |
| 9 | Susy | B | Math | 9 |
| 10 | Lucy | B | English | 6 |
| 11 | Jody | B | English | 7 |
| 12 | Susy | B | English | 8 |
+----+------+-------+---------+-------+
- 按照科目查看每个学生的成绩排名,要查看在每个科目中每个学生按照成绩从高到低的排序号和每个科目中的最好成绩。
mysql> SELECT
*,
LAST_VALUE(grade) OVER (
PARTITION BY subject
ORDER BY grade DESC
RANGE BETWEEN
UNBOUNDED PRECEDING AND UNBOUNDED FOLLOWING
) last_grade
FROM grade;
+----+------+-------+---------+-------+------------+
| id | name | class | subject | grade | last_grade |
+----+------+-------+---------+-------+------------+
| 5 | Tom | A | English | 8 | 6 |
| 12 | Susy | B | English | 8 | 6 |
| 4 | Tim | A | English | 7 | 6 |
| 6 | Jim | A | English | 7 | 6 |
| 11 | Jody | B | English | 7 | 6 |
| 10 | Lucy | B | English | 6 | 6 |
| 1 | Tim | A | Math | 9 | 6 |
| 9 | Susy | B | Math | 9 | 6 |
| 3 | Jim | A | Math | 8 | 6 |
| 7 | Lucy | B | Math | 8 | 6 |
| 2 | Tom | A | Math | 7 | 6 |
| 8 | Jody | B | Math | 6 | 6 |
+----+------+-------+---------+-------+------------+
- 窗口函数:
LAST_VALUE(grade) OVER (
PARTITION BY subject
ORDER BY grade DESC
RANGE BETWEEN
UNBOUNDED PRECEDING AND UNBOUNDED FOLLOWING
)
- 按照班级查看每个学生的总成绩排名,要查看在每个班级中每个学生按照总成绩从高到底的排序号和每个班级的最好成绩。
mysql> SELECT
t.*,
LAST_VALUE(t.sum_grade) OVER (
PARTITION BY class
ORDER BY t.sum_grade DESC
RANGE BETWEEN
UNBOUNDED PRECEDING AND UNBOUNDED FOLLOWING
) last_grade
FROM (
SELECT class,
name,
sum(grade) sum_grade
FROM grade
GROUP BY class, name
) t;
+-------+------+-----------+------------+
| class | name | sum_grade | last_grade |
+-------+------+-----------+------------+
| A | Tim | 16 | 15 |
| A | Tom | 15 | 15 |
| A | Jim | 15 | 15 |
| B | Susy | 17 | 13 |
| B | Lucy | 14 | 13 |
| B | Jody | 13 | 13 |
+-------+------+-----------+------------+
- 子查询使用 GROUP BY 子句和 sum() 按照班级和学生汇总出每个学生的总成绩。
mysql> SELECT class,
name,
sum(grade) sum_grade
FROM grade
GROUP BY class, name;
+-------+------+-----------+
| class | name | sum_grade |
+-------+------+-----------+
| A | Tim | 16 |
| A | Tom | 15 |
| A | Jim | 15 |
| B | Lucy | 14 |
| B | Jody | 13 |
| B | Susy | 17 |
+-------+------+-----------+
主语句在从这个子查询中按照班级进行分区,然后在每个分区内按照总成绩逆序排序,并使用 LAST_VALUE(grade) 返回每行在其关联的分区内最后一行中的成绩。
五、NTH_VALUE 评估值(指定行)
MySQL NTH_VALUE() 函数从当前行关联的窗口框架的指定的一行中返回评估的值
语法:
NTH_VALUE(expr, n)
OVER (
[PARTITION BY partition_column_list]
[ORDER BY order_column_list]
)
NTH_VALUE(expr, 1) 等效于 first_value(expr)。
参数说明:
expr:
必需的。它可以是一个列名或者表达式。n
:必需的。指定行的编号。partition_column_list
:参与分区的列的列表。order_column_list
:参与排序的列的列表。
返回值:
- 从当前行关联的窗口框架的指定的一行中返回评估的值。
- 返回
NULL情况:
指定的行不存在。
示例:
mysql> SELECT * FROM grade;
+----+------+-------+---------+-------+
| id | name | class | subject | grade |
+----+------+-------+---------+-------+
| 1 | Tim | A | Math | 9 |
| 2 | Tom | A | Math | 7 |
| 3 | Jim | A | Math | 8 |
| 4 | Tim | A | English | 7 |
| 5 | Tom | A | English | 8 |
| 6 | Jim | A | English | 7 |
| 7 | Lucy | B | Math | 8 |
| 8 | Jody | B | Math | 6 |
| 9 | Susy | B | Math | 9 |
| 10 | Lucy | B | English | 6 |
| 11 | Jody | B | English | 7 |
| 12 | Susy | B | English | 8 |
+----+------+-------+---------+-------+
- 按照科目查看每个学生的成绩排名,要查看在每个科目中每个学生按照成绩从高到低的排序号和每个科目中的最好成绩。
mysql> SELECT
*,
NTH_VALUE(grade, 1) OVER (
PARTITION BY subject
ORDER BY grade DESC
RANGE BETWEEN
UNBOUNDED PRECEDING AND UNBOUNDED FOLLOWING
) first_grade
FROM grade;
+----+------+-------+---------+-------+-------------+
| id | name | class | subject | grade | first_grade |
+----+------+-------+---------+-------+-------------+
| 5 | Tom | A | English | 8 | 8 |
| 12 | Susy | B | English | 8 | 8 |
| 4 | Tim | A | English | 7 | 8 |
| 6 | Jim | A | English | 7 | 8 |
| 11 | Jody | B | English | 7 | 8 |
| 10 | Lucy | B | English | 6 | 8 |
| 1 | Tim | A | Math | 9 | 9 |
| 9 | Susy | B | Math | 9 | 9 |
| 3 | Jim | A | Math | 8 | 9 |
| 7 | Lucy | B | Math | 8 | 9 |
| 2 | Tom | A | Math | 7 | 9 |
| 8 | Jody | B | Math | 6 | 9 |
+----+------+-------+---------+-------+-------------+
- 窗口函数:
NTH_VALUE(grade, 1) OVER (
PARTITION BY subject
ORDER BY grade DESC
RANGE BETWEEN
UNBOUNDED PRECEDING AND UNBOUNDED FOLLOWING
)
如果想在每行中显示每个科目中最差的成绩,可以使用 NTH_VALUE(grade, 3)
替换上面语句中的 NTH_VALUE(grade, 1)
。
- 按照班级查看每个学生的总成绩排名,要查看在每个班级中每个学生按照总成绩从高到底的排序号和每个班级的最好成绩。
mysql> SELECT
t.*,
NTH_VALUE(t.sum_grade, 1) OVER (
PARTITION BY class
ORDER BY t.sum_grade DESC
RANGE BETWEEN
UNBOUNDED PRECEDING AND UNBOUNDED FOLLOWING
) first_grade
FROM (
SELECT class,
name,
sum(grade) sum_grade
FROM grade
GROUP BY class, name
) t;
+-------+------+-----------+-------------+
| class | name | sum_grade | first_grade |
+-------+------+-----------+-------------+
| A | Tim | 16 | 16 |
| A | Tom | 15 | 16 |
| A | Jim | 15 | 16 |
| B | Susy | 17 | 17 |
| B | Lucy | 14 | 17 |
| B | Jody | 13 | 17 |
+-------+------+-----------+-------------+
- 子查询使用 GROUP BY 子句和 sum() 按照班级和学生汇总出每个学生的总成绩。
mysql> SELECT class,
name,
sum(grade) sum_grade
FROM grade
GROUP BY class, name;
+-------+------+-----------+
| class | name | sum_grade |
+-------+------+-----------+
| A | Tim | 16 |
| A | Tom | 15 |
| A | Jim | 15 |
| B | Lucy | 14 |
| B | Jody | 13 |
| B | Susy | 17 |
+-------+------+-----------+
主语句在从这个子查询中按照班级进行分区,然后在每个分区内按照总成绩逆序排序,并使用 NTH_VALUE(t.sum_grade, 1)
返回每行在其关联的分区内第一行中的成绩。
六、PERCENT_RANK 取分区相对排名
MySQL PERCENT_RANK() 函数返回当前行所在的分区内的相对排名,也就是 (rank() - 1) / (分区总行数 - 1)
语法:
PERCENT_RANK()
OVER (
[PARTITION BY partition_column_list]
[ORDER BY order_column_list]
)
参数说明:
partition_column_list
:参与分区的列的列表。order_column_list
:参与排序的列的列表。
返回值:
- 一个介于 0 和 1 (包含) 之间的数,表示当前行所在的分区内的相对排名,其计算公式为是
(rank() - 1) / (分区总行数 - 1)
。
示例:
mysql> SELECT * FROM grade;
+----+------+-------+---------+-------+
| id | name | class | subject | grade |
+----+------+-------+---------+-------+
| 1 | Tim | A | Math | 9 |
| 2 | Tom | A | Math | 7 |
| 3 | Jim | A | Math | 8 |
| 4 | Tim | A | English | 7 |
| 5 | Tom | A | English | 8 |
| 6 | Jim | A | English | 7 |
| 7 | Lucy | B | Math | 8 |
| 8 | Jody | B | Math | 6 |
| 9 | Susy | B | Math | 9 |
| 10 | Lucy | B | English | 6 |
| 11 | Jody | B | English | 7 |
| 12 | Susy | B | English | 8 |
+----+------+-------+---------+-------+
- 按照科目查看每个学生的成绩的相对排名,要查看在每个科目中每个学生按照成绩从高到低的相对排名。
mysql> SELECT *,
rank() OVER (
PARTITION BY subject
ORDER BY grade DESC
) "rank",
PERCENT_RANK() OVER (
PARTITION BY subject
ORDER BY grade DESC
) "percent_rank"
FROM grade;
+----+------+-------+---------+-------+------+--------------+
| id | name | class | subject | grade | rank | percent_rank |
+----+------+-------+---------+-------+------+--------------+
| 5 | Tom | A | English | 8 | 1 | 0 |
| 12 | Susy | B | English | 8 | 1 | 0 |
| 4 | Tim | A | English | 7 | 3 | 0.4 |
| 6 | Jim | A | English | 7 | 3 | 0.4 |
| 11 | Jody | B | English | 7 | 3 | 0.4 |
| 10 | Lucy | B | English | 6 | 6 | 1 |
| 1 | Tim | A | Math | 9 | 1 | 0 |
| 9 | Susy | B | Math | 9 | 1 | 0 |
| 3 | Jim | A | Math | 8 | 3 | 0.4 |
| 7 | Lucy | B | Math | 8 | 3 | 0.4 |
| 2 | Tom | A | Math | 7 | 5 | 0.8 |
| 8 | Jody | B | Math | 6 | 6 | 1 |
+----+------+-------+---------+-------+------+--------------+
- 窗口函数:
PERCENT_RANK() OVER (
PARTITION BY subject
ORDER BY grade DESC
)
- 按照班级查看每个学生的总成绩的相对排名,要查看在每个班级中每个学生按照总成绩从高到底的相对排名。
mysql> SELECT t.*,
PERCENT_RANK() OVER (
PARTITION BY class
ORDER BY t.sum_grade DESC
) "percent_rank"
FROM (
SELECT class,
name,
sum(grade) sum_grade
FROM grade
GROUP BY class, name
) t;
+-------+------+-----------+--------------+
| class | name | sum_grade | percent_rank |
+-------+------+-----------+--------------+
| A | Tim | 16 | 0 |
| A | Tom | 15 | 0.5 |
| A | Jim | 15 | 0.5 |
| B | Susy | 17 | 0 |
| B | Lucy | 14 | 0.5 |
| B | Jody | 13 | 1 |
+-------+------+-----------+--------------+
- 子查询使用 GROUP BY 子句和 sum() 按照班级和学生汇总出每个学生的总成绩。
mysql> SELECT class,
name,
sum(grade) sum_grade
FROM grade
GROUP BY class, name;
+-------+------+-----------+
| class | name | sum_grade |
+-------+------+-----------+
| A | Tim | 16 |
| A | Tom | 15 |
| A | Jim | 15 |
| B | Lucy | 14 |
| B | Jody | 13 |
| B | Susy | 17 |
+-------+------+-----------+
主语句在从这个子查询中按照班级进行分区,然后在每个分区内按照总成绩逆序排序,并使用 PERCENT_RANK()
返回每行在其关联的分区内的相对排名。
七、ROW_NUMBER 取分区序号
MySQL ROW_NUMBER() 函数返回当前行所在的分区内的序号,从 1 开始。
语法:
ROW_NUMBER()
OVER (
[PARTITION BY partition_column_list]
[ORDER BY order_column_list]
)
参数说明:
partition_column_list
:参与分区的列的列表。order_column_list
:参与排序的列的列表。
返回值:
- 当前行所在的分区内的序号,从 1 开始。
示例:
mysql> SELECT * FROM grade;
+----+------+-------+---------+-------+
| id | name | class | subject | grade |
+----+------+-------+---------+-------+
| 1 | Tim | A | Math | 9 |
| 2 | Tom | A | Math | 7 |
| 3 | Jim | A | Math | 8 |
| 4 | Tim | A | English | 7 |
| 5 | Tom | A | English | 8 |
| 6 | Jim | A | English | 7 |
| 7 | Lucy | B | Math | 8 |
| 8 | Jody | B | Math | 6 |
| 9 | Susy | B | Math | 9 |
| 10 | Lucy | B | English | 6 |
| 11 | Jody | B | English | 7 |
| 12 | Susy | B | English | 8 |
+----+------+-------+---------+-------+
- 按照科目查看每个学生的成绩的排序号,要查看在每个科目中每个学生按照成绩从高到低的排序号。
mysql> SELECT *,
ROW_NUMBER() OVER (
PARTITION BY subject
ORDER BY grade DESC
) "row_number"
FROM grade;
+----+------+-------+---------+-------+------------+
| id | name | class | subject | grade | row_number |
+----+------+-------+---------+-------+------------+
| 5 | Tom | A | English | 8 | 1 |
| 12 | Susy | B | English | 8 | 2 |
| 4 | Tim | A | English | 7 | 3 |
| 6 | Jim | A | English | 7 | 4 |
| 11 | Jody | B | English | 7 | 5 |
| 10 | Lucy | B | English | 6 | 6 |
| 1 | Tim | A | Math | 9 | 1 |
| 9 | Susy | B | Math | 9 | 2 |
| 3 | Jim | A | Math | 8 | 3 |
| 7 | Lucy | B | Math | 8 | 4 |
| 2 | Tom | A | Math | 7 | 5 |
| 8 | Jody | B | Math | 6 | 6 |
+----+------+-------+---------+-------+------------+
- 窗口函数:
ROW_NUMBER() OVER (
PARTITION BY subject
ORDER BY grade DESC
)
- 按照班级查看每个学生的总成绩的排序号,要查看在每个班级中每个学生按照总成绩从高到底的排序号。
mysql> SELECT t.*,
ROW_NUMBER() OVER (
PARTITION BY class
ORDER BY t.sum_grade DESC
) "row_number"
FROM (
SELECT class,
name,
sum(grade) sum_grade
FROM grade
GROUP BY class, name
) t;
+-------+------+-----------+------------+
| class | name | sum_grade | row_number |
+-------+------+-----------+------------+
| A | Tim | 16 | 1 |
| A | Tom | 15 | 2 |
| A | Jim | 15 | 3 |
| B | Susy | 17 | 1 |
| B | Lucy | 14 | 2 |
| B | Jody | 13 | 3 |
+-------+------+-----------+------------+
- 子查询使用 GROUP BY 子句和 sum() 按照班级和学生汇总出每个学生的总成绩。
mysql> SELECT class,
name,
sum(grade) sum_grade
FROM grade
GROUP BY class, name;
+-------+------+-----------+
| class | name | sum_grade |
+-------+------+-----------+
| A | Tim | 16 |
| A | Tom | 15 |
| A | Jim | 15 |
| B | Lucy | 14 |
| B | Jody | 13 |
| B | Susy | 17 |
+-------+------+-----------+
主语句在从这个子查询中按照班级进行分区,然后在每个分区内按照总成绩逆序排序,并使用 ROW_NUMBER()
返回每行在其关联的分区内的排序号。
八、LAG
MySQL LAG() 函数返回来自当前行所在的分区内当前行之前的指定行之内的值。
语法:
LAG(expr[, offset[, default]])
OVER (
[PARTITION BY partition_column_list]
[ORDER BY order_column_list]
)
参数说明:
expr
:必需的。它可以是一个列名或者表达式。offset
:可选的。相对于当前行的偏移的行数。默认值为 1。default
:可选的。它可以是一个列名或者表达式。partition_column_list
:参与分区的列的列表。order_column_list
:参与排序的列的列表。
返回值:
- 来自当前行所在的分区内当前行之前的指定行之内的值。
示例:
mysql> SELECT * FROM revenue;
+----+------+---------+---------+
| id | year | quarter | revenue |
+----+------+---------+---------+
| 1 | 2020 | 1 | 3515 |
| 2 | 2020 | 2 | 3678 |
| 3 | 2020 | 3 | 4203 |
| 4 | 2020 | 4 | 3924 |
| 5 | 2021 | 1 | 3102 |
| 6 | 2021 | 2 | 3293 |
| 7 | 2021 | 3 | 3602 |
| 8 | 2021 | 4 | 2901 |
+----+------+---------+---------+
- 使用 MySQL LAG() 函数和下一季度的收益比较,在使用 MySQL LAG() 函数在每行中添加 next_quarter_revenue 列以比较当前季度和下一季度的收益:
mysql> SELECT
*,
LAG(revenue, 1) OVER (
PARTITION BY year
ORDER BY quarter DESC
) next_quarter_revenue
FROM revenue;
+----+------+---------+---------+----------------------+
| id | year | quarter | revenue | next_quarter_revenue |
+----+------+---------+---------+----------------------+
| 4 | 2020 | 4 | 3924 | NULL |
| 3 | 2020 | 3 | 4203 | 3924 |
| 2 | 2020 | 2 | 3678 | 4203 |
| 1 | 2020 | 1 | 3515 | 3678 |
| 8 | 2021 | 4 | 2901 | NULL |
| 7 | 2021 | 3 | 3602 | 2901 |
| 6 | 2021 | 2 | 3293 | 3602 |
| 5 | 2021 | 1 | 3102 | 3293 |
+----+------+---------+---------+----------------------+
- 窗口函数:
LAG(revenue, 1) OVER (
PARTITION BY year
ORDER BY quarter DESC
) next_quarter_revenue
那么在 next_quarter_revenue
列中存放的就是当前行的下一个季度的收益。所以,每个分区的第一行中的 next_quarter_revenue
列的值为 null。
-
使用默认值:
mysql> SELECT
*,
LAG(revenue, 1, 0) OVER (
PARTITION BY year
ORDER BY quarter DESC
) next_quarter_revenue
FROM revenue;
+----+------+---------+---------+----------------------+
| id | year | quarter | revenue | next_quarter_revenue |
+----+------+---------+---------+----------------------+
| 4 | 2020 | 4 | 3924 | 0 |
| 3 | 2020 | 3 | 4203 | 3924 |
| 2 | 2020 | 2 | 3678 | 4203 |
| 1 | 2020 | 1 | 3515 | 3678 |
| 8 | 2021 | 4 | 2901 | 0 |
| 7 | 2021 | 3 | 3602 | 2901 |
| 6 | 2021 | 2 | 3293 | 3602 |
| 5 | 2021 | 1 | 3102 | 3293 |
+----+------+---------+---------+----------------------+
九、LEAD
MySQL LEAD() 函数返回来自当前行所在的分区内当前行之后的指定行之内的行的值。
语法:
LEAD(expr[, offset[, default]])
OVER (
[PARTITION BY partition_column_list]
[ORDER BY order_column_list]
)
参数说明:
expr
:必需的。它可以是一个列名或者表达式。offset
:可选的。相对于当前行的偏移的行数。默认值为 1。default
:可选的。它可以是一个列名或者表达式。partition_column_list
:参与分区的列的列表。order_column_list
:参与排序的列的列表。
返回值:
- 来自当前行所在的分区内当前行之后的指定行之内的行的值。
示例:
mysql> SELECT * FROM revenue;
+----+------+---------+---------+
| id | year | quarter | revenue |
+----+------+---------+---------+
| 1 | 2020 | 1 | 3515 |
| 2 | 2020 | 2 | 3678 |
| 3 | 2020 | 3 | 4203 |
| 4 | 2020 | 4 | 3924 |
| 5 | 2021 | 1 | 3102 |
| 6 | 2021 | 2 | 3293 |
| 7 | 2021 | 3 | 3602 |
| 8 | 2021 | 4 | 2901 |
+----+------+---------+---------+
- 使用 MySQL LEAD() 函数和上一季度的收益比较,使用 MySQL LEAD() 函数在每行中添加 last_quarter_revenue 列以比较当前季度和上一季度的收益:
mysql> SELECT
*,
LEAD(revenue, 1) OVER (
PARTITION BY year
ORDER BY quarter DESC
) last_quarter_revenue
FROM revenue;
+----+------+---------+---------+----------------------+
| id | year | quarter | revenue | last_quarter_revenue |
+----+------+---------+---------+----------------------+
| 4 | 2020 | 4 | 3924 | 4203 |
| 3 | 2020 | 3 | 4203 | 3678 |
| 2 | 2020 | 2 | 3678 | 3515 |
| 1 | 2020 | 1 | 3515 | NULL |
| 8 | 2021 | 4 | 2901 | 3602 |
| 7 | 2021 | 3 | 3602 | 3293 |
| 6 | 2021 | 2 | 3293 | 3102 |
| 5 | 2021 | 1 | 3102 | NULL |
+----+------+---------+---------+----------------------+
- 窗口函数:
LEAD(revenue, 1) OVER (
PARTITION BY year
ORDER BY quarter DESC
) last_quarter_revenue
那么在 last_quarter_revenue
列中存放的就是当前行的上一个季度的收益。所以,每个分区的最后一行中的 last_quarter_revenue
列的值为 null。
- 使用默认值:
mysql> SELECT
*,
LEAD(revenue, 1, 0) OVER (
PARTITION BY year
ORDER BY quarter DESC
) last_quarter_revenue
FROM revenue;
+----+------+---------+---------+----------------------+
| id | year | quarter | revenue | last_quarter_revenue |
+----+------+---------+---------+----------------------+
| 4 | 2020 | 4 | 3924 | 4203 |
| 3 | 2020 | 3 | 4203 | 3678 |
| 2 | 2020 | 2 | 3678 | 3515 |
| 1 | 2020 | 1 | 3515 | 0 |
| 8 | 2021 | 4 | 2901 | 3602 |
| 7 | 2021 | 3 | 3602 | 3293 |
| 6 | 2021 | 2 | 3293 | 3102 |
| 5 | 2021 | 1 | 3102 | 0 |
+----+------+---------+---------+----------------------+
十、NTILE
MySQL NTILE() 函数将当前行所在的分区内的所有行尽可能平均的分成指定数量的区间,并返回当前行所在的区间编号。
每个区间, MySQL 称之为一个排名桶。 NTILE()
根据指定排序为每个桶指设定排名。
语法:
NTILE(buckets)
OVER (
[PARTITION BY partition_column_list]
[ORDER BY order_column_list]
)
参数说明:
buckets
:必需的。桶的数量。桶的数量最大为此分区内的行的数量。partition_column_list
:参与分区的列的列表。order_column_list
:参与排序的列的列表。
返回值:
- 将当前行所在的分区内的所有行尽可能平均的分成指定数量的排名桶,并返回当前行所在的桶的排名。
示例:
- 下面语句展示使用
NTILE(3)
将他们按照升序分成 3 个桶,按照尽可能平均分配的原则,那么 1-3 的桶排名是 1, 4-6 的桶排名是 2, 7-9 的桶排名是 3。
mysql> SELECT
x,
NTILE(3) over (
ORDER BY x
) "ntile"
FROM (
SELECT 1 x
UNION
SELECT 2 x
UNION
SELECT 3 x
UNION
SELECT 4 x
UNION
SELECT 5 x
UNION
SELECT 6 x
UNION
SELECT 7 x
UNION
SELECT 8 x
UNION
SELECT 9 x
) t;
+---+-------+
| x | ntile |
+---+-------+
| 1 | 1 |
| 2 | 1 |
| 3 | 1 |
| 4 | 2 |
| 5 | 2 |
| 6 | 2 |
| 7 | 3 |
| 8 | 3 |
| 9 | 3 |
+---+-------+
mysql> SELECT * FROM revenue;
+----+------+---------+---------+
| id | year | quarter | revenue |
+----+------+---------+---------+
| 1 | 2020 | 1 | 3515 |
| 2 | 2020 | 2 | 3678 |
| 3 | 2020 | 3 | 4203 |
| 4 | 2020 | 4 | 3924 |
| 5 | 2021 | 1 | 3102 |
| 6 | 2021 | 2 | 3293 |
| 7 | 2021 | 3 | 3602 |
| 8 | 2021 | 4 | 2901 |
+----+------+---------+---------+
- 使用 MySQL
NTILE()
函数将每年的收益按照升序分成 2 桶:
mysql> SELECT
*,
NTILE(2) OVER (
PARTITION BY year
ORDER BY revenue
) "ntile"
FROM revenue;
+----+------+---------+---------+-------+
| id | year | quarter | revenue | ntile |
+----+------+---------+---------+-------+
| 1 | 2020 | 1 | 3515 | 1 |
| 2 | 2020 | 2 | 3678 | 1 |
| 4 | 2020 | 4 | 3924 | 2 |
| 3 | 2020 | 3 | 4203 | 2 |
| 8 | 2021 | 4 | 2901 | 1 |
| 5 | 2021 | 1 | 3102 | 1 |
| 6 | 2021 | 2 | 3293 | 2 |
| 7 | 2021 | 3 | 3602 | 2 |
+----+------+---------+---------+-------+
- 窗口函数:
NTILE(2) OVER (
PARTITION BY year
ORDER BY revenue
)
十一、RANK
MySQL RANK() 函数返回当前行所在的分区内的排名,从 1 开始,但有间隔。即,相同的值具有相同的排名,但是下一个不同的值的排名采用 row_number() 编号。比如,如果有 2 个第一名,那么第三位的排名是 3。这与 dense_rank() 函数是不同的。
语法:
RANK()
OVER (
[PARTITION BY partition_column_list]
[ORDER BY order_column_list]
)
参数说明:
partition_column_list
:参与分区的列的列表。order_column_list
:参与排序的列的列表。
返回值:
- 当前行所在的分区内的排名,从 1 开始,但有间隔。
示例:
mysql> SELECT * FROM grade;
+----+------+-------+---------+-------+
| id | name | class | subject | grade |
+----+------+-------+---------+-------+
| 1 | Tim | A | Math | 9 |
| 2 | Tom | A | Math | 7 |
| 3 | Jim | A | Math | 8 |
| 4 | Tim | A | English | 7 |
| 5 | Tom | A | English | 8 |
| 6 | Jim | A | English | 7 |
| 7 | Lucy | B | Math | 8 |
| 8 | Jody | B | Math | 6 |
| 9 | Susy | B | Math | 9 |
| 10 | Lucy | B | English | 6 |
| 11 | Jody | B | English | 7 |
| 12 | Susy | B | English | 8 |
+----+------+-------+---------+-------+
- 按照科目查看每个学生的成绩排名,要查看在每个科目中每个学生按照成绩从高到低的排名。
mysql> SELECT *,
RANK() OVER (
PARTITION BY subject
ORDER BY grade DESC
) "rank",
dense_rank() OVER (
PARTITION BY subject
ORDER BY grade DESC
) "dense_rank"
FROM grade;
+----+------+-------+---------+-------+------+------------+
| id | name | class | subject | grade | rank | dense_rank |
+----+------+-------+---------+-------+------+------------+
| 5 | Tom | A | English | 8 | 1 | 1 |
| 12 | Susy | B | English | 8 | 1 | 1 |
| 4 | Tim | A | English | 7 | 3 | 2 |
| 6 | Jim | A | English | 7 | 3 | 2 |
| 11 | Jody | B | English | 7 | 3 | 2 |
| 10 | Lucy | B | English | 6 | 6 | 3 |
| 1 | Tim | A | Math | 9 | 1 | 1 |
| 9 | Susy | B | Math | 9 | 1 | 1 |
| 3 | Jim | A | Math | 8 | 3 | 2 |
| 7 | Lucy | B | Math | 8 | 3 | 2 |
| 2 | Tom | A | Math | 7 | 5 | 3 |
| 8 | Jody | B | Math | 6 | 6 | 4 |
+----+------+-------+---------+-------+------+------------+
- 窗口函数:
RANK() OVER (
PARTITION BY subject
ORDER BY grade DESC
)
可以看出 RANK()
与 dense_rank()
的不同。
- 按照班级查看每个学生的总成绩排名,要查看在每个班级中每个学生按照总成绩从高到底的排名。
mysql> SELECT t.*,
RANK() OVER (
PARTITION BY class
ORDER BY t.sum_grade DESC
) "rank"
FROM (
SELECT class,
name,
sum(grade) sum_grade
FROM grade
GROUP BY class, name
) t;
+-------+------+-----------+------+
| class | name | sum_grade | rank |
+-------+------+-----------+------+
| A | Tim | 16 | 1 |
| A | Tom | 15 | 2 |
| A | Jim | 15 | 2 |
| B | Susy | 17 | 1 |
| B | Lucy | 14 | 2 |
| B | Jody | 13 | 3 |
+-------+------+-----------+------+
- 子查询使用 GROUP BY 子句和 sum() 按照班级和学生汇总出每个学生的总成绩。
mysql> SELECT class,
name,
sum(grade) sum_grade
FROM grade
GROUP BY class, name;
+-------+------+-----------+
| class | name | sum_grade |
+-------+------+-----------+
| A | Tim | 16 |
| A | Tom | 15 |
| A | Jim | 15 |
| B | Lucy | 14 |
| B | Jody | 13 |
| B | Susy | 17 |
+-------+------+-----------+
主语句在从这个子查询中按照班级进行分区,然后在每个分区内按照总成绩逆序排序,并使用 RANK()
返回每行在其关联的分区内的排名。