话不多说, 先看数据表信息.
数据表信息:
survey_log 表格的字段有:uid, action, question_id, answer_id, q_num, timestamp。
- uid是用户id;
- action的值为:“show”, “answer”, “skip”;
- 当action是"answer"时,answer_id不为空
- 相反,当action是"show"和"skip"时answer_id为空(null);
- q_num是问题的数字序号。
+-----+---------+-------------+-----------+-------+-----------+
| uid | action | question_id | answer_id | q_num | timestamp |
+-----+---------+-------------+-----------+-------+-----------+
| 5 | show | 285 | NULL | 1 | 123 |
| 5 | answer | 285 | 124124 | 1 | 124 |
| 5 | show | 369 | NULL | 2 | 125 |
| 5 | skip | 369 | NULL | 2 | 126 |
+-----+---------+-------------+-----------+-------+-----------+
话不多说, 再看需求~
需求:
写一条sql语句找出回答率(show 出现次数 / answer 出现次数)最高的 question_id。
预计结果输出如下:
+----+----------+
| question_id |
+----+----------+
| 285 |
+----+----------+
话不多说, 进行拆解~
拆解:
这里的想法是使用case when
函数根据条件进行设置, 再结合对题目编号question_id
进行分组聚合得到最后的结果.
select t.question_id
from (
select question_id,
case action when 'show' then 1 else 0 end as show_num,
case action when 'answer' then 1 else 0 end as answer_num
from survey_log
) as t
group by t.question_id
order by sum(t.show_num) / sum(t.answer_num) desc
limit 1
;
结果如下:
最后给大家介绍一下我这边的创建数据表和插入数据的操作步骤, 想要自己测试的话, 可以参考:
CREATE TABLE survey_log (
uid INT,
action VARCHAR(50),
question_id INT,
answer_id INT,
q_num INT,
timestamp INT
);
INSERT INTO survey_log (uid, action, question_id, answer_id, q_num, timestamp)
VALUES (5, 'show', 285, NULL, 1, 123),
(5, 'answer', 285, 124124, 1, 124),
(5, 'show', 369, NULL, 2, 125),
(5, 'skip', 369, NULL, 2, 126);
数据信息如下: