I have the following two tables in a PostgreSQL database:
dummy=# select * from employee;
 id | name
----+-------
  1 | John
  2 | Susan
  3 | Jim
  4 | Sarah
(4 rows)
dummy=# select * from stats;
 id | arrival  |    day     | employee_id
----+----------+------------+-------------
  2 | 08:31:34 | monday     |           2
  4 | 08:15:00 | monday     |           3
  5 | 08:43:00 | monday     |           4
  1 | 08:34:00 | monday     |           1
  7 | 08:29:00 | midweek    |           1
  8 | 08:31:00 | midweek    |           2
  9 | 08:10:00 | midweek    |           3
 10 | 08:40:00 | midweek    |           4
 11 | 08:28:00 | midweek    |           1
 12 | 08:33:00 | midweek    |           2
 14 | 08:21:00 | midweek    |           3
 15 | 08:45:00 | midweek    |           4
 16 | 08:25:00 | midweek    |           1
 17 | 08:35:00 | midweek    |           2
 18 | 08:44:00 | midweek    |           4
 19 | 08:10:00 | friday     |           1
 20 | 08:40:00 | friday     |           2
 21 | 08:30:00 | friday     |           3
 22 | 08:30:00 | friday     |           4
(19 rows)
I want to select all employees who arrived between 8:25 and 8:35. I can do that fairly easily with the following query:
SELECT * FROM stats
WHERE
arrival >= (time '8:30' - interval '5 minutes')
AND
arrival <= (time '8:30' + interval '5 minutes')
AND
(day = 'midweek' or day = 'friday');
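However, there is a further criterion: I only want to select employees who arrived within that time window at least 60% of the time. This is where I am stuck. I don't know how to calculate that ratio.
What does a query that satisfies all of these criteria look like?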
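Clarification
Apparently, the above description of the ratio was misleading.
When calculating the ratio, only rows matching the condition (day = 'midweek' or day = 'friday') should be considered. So in the sample data, John and Susan each appear four times on midweek and friday, and three of those four arrivals are on time. Hence the ratio for Susan and John is 3/4, i.e. 75%.
Best answer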
Use common table expressions to calculate the needed counts, e.g.:
with in_time as (
    select *
    from stats
    where arrival >= (time '8:30' - interval '5 minutes')
    and arrival <= (time '8:30' + interval '5 minutes')
    and (day = 'midweek' or day = 'friday')
),
count_in_time as (
    select employee_id, count(*)
    from in_time
    group by employee_id
),
total_count as (
    select employee_id, count(*)
    from stats
    where day = 'midweek' or day = 'friday'
    group by employee_id
)
select
    i.*,
    c.count as in_time,
    t.count as total_count,
    round(c.count * 100.0 / t.count, 2) as ratio
from in_time i
join count_in_time c using(employee_id)
join total_count t using(employee_id);
Result:
 id | arrival  |   day   | employee_id | in_time | total_count | ratio
----+----------+---------+-------------+---------+-------------+-------
 16 | 08:25:00 | midweek |           1 |       3 |           4 | 75.00
 11 | 08:28:00 | midweek |           1 |       3 |           4 | 75.00
  7 | 08:29:00 | midweek |           1 |       3 |           4 | 75.00
 17 | 08:35:00 | midweek |           2 |       3 |           4 | 75.00
 12 | 08:33:00 | midweek |           2 |       3 |           4 | 75.00
  8 | 08:31:00 | midweek |           2 |       3 |           4 | 75.00
 21 | 08:30:00 | friday  |           3 |       1 |           3 | 33.33
 22 | 08:30:00 | friday  |           4 |       1 |           4 | 25.00
(8 rows)
You can add an appropriate condition to the where clause of the final query.
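As a minimal sketch of such a condition, the query above can be repeated with a threshold on the ratio. The cutoff of 60 is taken from the question; writing it as a where clause on the unrounded percentage is an assumption about what "appropriate condition" means here, not part of the original answer.

```sql
-- Sketch: same CTEs as in the answer, plus a 60% threshold in the final query.
with in_time as (
    select *
    from stats
    where arrival >= (time '8:30' - interval '5 minutes')
    and arrival <= (time '8:30' + interval '5 minutes')
    and (day = 'midweek' or day = 'friday')
),
count_in_time as (
    select employee_id, count(*)
    from in_time
    group by employee_id
),
total_count as (
    select employee_id, count(*)
    from stats
    where day = 'midweek' or day = 'friday'
    group by employee_id
)
select
    i.*,
    c.count as in_time,
    t.count as total_count,
    round(c.count * 100.0 / t.count, 2) as ratio
from in_time i
join count_in_time c using(employee_id)
join total_count t using(employee_id)
-- Assumed threshold: keep employees who were on time at least 60% of the time.
where c.count * 100.0 / t.count >= 60;
```

With the sample data this should keep only the six rows belonging to employees 1 and 2, since their ratio is 75.00 while employees 3 and 4 fall below the cutoff.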
If you only want aggregated data with the employees and their ratios, use count() with filter:
select employee_id, name, in_time * 1.0 / total as ratio
from (
    select
        employee_id,
        count(*) filter (
            where arrival >= time '8:30' - interval '5 minutes'
              and arrival <= time '8:30' + interval '5 minutes'
        ) as in_time,
        count(*) as total
    from stats
    where day in ('midweek', 'friday')
    group by 1
) s
join employee e on e.id = s.employee_id
where in_time * 1.0 / total >= 0.6;
 employee_id | name  |         ratio
-------------+-------+------------------------
           1 | John  | 0.75000000000000000000
           2 | Susan | 0.75000000000000000000
(2 rows)
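The same aggregate can probably be written without a subquery. The sketch below is not from the original answer: it averages the on-time condition cast to an integer (1 when on time, 0 otherwise) and filters with having. It assumes that arrival is of type time and contains no NULLs, and it writes the window as a fixed between range, which is inclusive on both ends like the >= and <= pair used above.

```sql
-- Sketch (an alternative, not the answer's method): ratio as the average of a 0/1 flag.
select
    s.employee_id,
    e.name,
    round(avg((s.arrival between time '8:25' and time '8:35')::int), 2) as ratio
from stats s
join employee e on e.id = s.employee_id
where s.day in ('midweek', 'friday')
group by s.employee_id, e.name
-- Assumed threshold from the question: on time at least 60% of the time.
having avg((s.arrival between time '8:25' and time '8:35')::int) >= 0.6;
```

On the sample data this should again return John and Susan, each with a ratio of 0.75. The count(*) filter form is arguably easier to read; the avg form just keeps everything in a single query level.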
Regarding database - Postgres: calculating the ratio of table entries that match a criterion, a similar question can be found on Stack Overflow: https://stackoverflow.com/questions/47079995/