Problem description
CREATE TABLE dialoglog (
userid int,
appid int,
type varchar(100) ,
ds timestamp
);
Data
userid appid type ds
1 1 imp January, 19 2016 03:14:07
1 1 click January, 20 2016 03:14:07
2 1 imp January, 19 2016 03:14:07
2 1 click January, 18 2016 03:14:07
3 6 imp January, 19 2016 03:14:07
10 1 click January, 19 2016 03:14:07
1 1 imp January, 20 2016 03:12:07
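To reproduce this locally, the rows above could be loaded with something like the following. This is only a sketch: it assumes the dialoglog table defined above and rewrites the sqlfiddle-style dates as standard MySQL timestamp literals.

```sql
-- Sample rows from the question, in the same order as listed above.
INSERT INTO dialoglog (userid, appid, type, ds) VALUES
    (1,  1, 'imp',   '2016-01-19 03:14:07'),
    (1,  1, 'click', '2016-01-20 03:14:07'),
    (2,  1, 'imp',   '2016-01-19 03:14:07'),
    (2,  1, 'click', '2016-01-18 03:14:07'),  -- click before any impression
    (3,  6, 'imp',   '2016-01-19 03:14:07'),
    (10, 1, 'click', '2016-01-19 03:14:07'),  -- click with no impression at all
    (1,  1, 'imp',   '2016-01-20 03:12:07');
```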
Question
What is the CTR (clicks / impressions) per appid?
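We want to print the click-through rate per group.
The problem is that we have "bad" data:
- User 10 clicked on appid 1 but has no impression for appid 1 => we should ignore that click.
- User 2 clicked on appid 1 before having an impression for appid 1 => we should ignore that click.
Therefore, a query like this: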
SELECT SUM(CASE WHEN type='click' THEN 1 else 0 END)*1.0/SUM(CASE WHEN dialoglog.type='imp' THEN 1 else 0 END) as ctr,appid
FROM dialoglog GROUP BY appid;
would not work because of the constraints I just described. Is it possible to express all of these constraints in a single MySQL query?
Here is a sqlfiddle to play with.
For appid 1 we have 3 impressions and 1 valid click (user 2's click has a timestamp earlier than that user's impression, and user 10 has no impression at all), so the CTR should be 1/3.
For appid 6 we have only one impression and no clicks => CTR = 0.
Therefore I am expecting this result:
appid CTR
1 0.333
6 0
What I tried so far
I thought about doing a left join, but the problem is that it duplicates the rows of the right table:
SELECT * FROM dialoglog t1 LEFT JOIN dialoglog t2 on t1.userid=t2.userid and t1.appid=t2.appid and t1.type<>t2.type WHERE t1.type="imp" and t1.ds<t2.ds;
userid appid type ds userid appid type ds
1 1 imp January, 19 2016 03:14:07 1 1 click January, 20 2016 03:14:07
1 1 imp January, 19 2016 03:12:07 1 1 click January, 20 2016 03:14:07
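The duplication arises because the self join produces one row per (impression, click) pair, so a click preceded by two impressions shows up twice. One way to see this, as a sketch against the same table, is to count how many earlier impressions each click matches (matching_imps is only an illustrative alias):

```sql
-- One output row per click that has at least one earlier impression
-- by the same user on the same appid.
SELECT t2.userid,
       t2.appid,
       t2.ds    AS click_ds,
       COUNT(*) AS matching_imps   -- number of earlier impressions paired with this click
FROM dialoglog t1
JOIN dialoglog t2
  ON t1.userid = t2.userid
 AND t1.appid  = t2.appid
WHERE t1.type = 'imp'
  AND t2.type = 'click'
  AND t1.ds < t2.ds
GROUP BY t2.userid, t2.appid, t2.ds;
```

On the sample data this should return a single row, for user 1's click, with matching_imps = 2. The clicks of users 2 and 10 match no earlier impression and therefore do not appear.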
Recommended answer
You seem to need the earliest "imp" for each user. Then you can aggregate over the rows at or after that point in time:
SELECT dl.appid, SUM(dl.type = 'click') / SUM(dl.type = 'imp')
FROM dialoglog dl LEFT JOIN
(SELECT dl2.userId, MIN(ds) as imp_ds
FROM dialoglog dl2
WHERE type = 'imp'
GROUP BY dl2.userid
) dlimp
ON dl.userid = dlimp.userid
WHERE dl.ds >= dlimp.imp_ds
GROUP BY dl.appid;
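Note that the subquery above takes the earliest impression per user across all apps, whereas the rules in the question are stated per user and per appid. If a user's impression on one app should not validate a click on another, a possible variant (a sketch, not part of the original answer) adds appid to the grouping and to the join condition. The aliases first_imp and ctr are illustrative, and an inner join is used because the WHERE filter discards unmatched rows anyway:

```sql
SELECT dl.appid,
       SUM(dl.type = 'click') / SUM(dl.type = 'imp') AS ctr
FROM dialoglog dl
JOIN (
    -- Earliest impression for each (user, app) pair.
    SELECT userid, appid, MIN(ds) AS imp_ds
    FROM dialoglog
    WHERE type = 'imp'
    GROUP BY userid, appid
) first_imp
  ON dl.userid = first_imp.userid
 AND dl.appid  = first_imp.appid
-- Keep only events at or after that first impression; this drops
-- user 2's early click and all of user 10's rows.
WHERE dl.ds >= first_imp.imp_ds
GROUP BY dl.appid;
```

On the sample data this keeps 1 click and 3 impressions for appid 1 and 1 impression for appid 6, which matches the expected CTRs of 1/3 and 0. Every surviving group contains at least its own first impression, so the denominator is never zero.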