How to GROUP BY runs of consecutive dates that are separated by gaps

Problem description

Assume you have (in Postgres 9.1) a table like this:

date | value

which has some gaps in it (I mean: not every possible date between min(date) and max(date) has its own row).

My problem is how to aggregate this data so that each consistent group (without gaps) is treated separately, like this:

min_date | max_date | [some aggregate of "value" column]

Any ideas how to do it? I believe it is possible with window functions but after a while trying with lag() and lead() I'm a little stuck.

For instance, if the data look like this:

 date          | value
---------------+-------
 2011-10-31    | 2
 2011-11-01    | 8
 2011-11-02    | 10
 2012-09-13    | 1
 2012-09-14    | 4
 2012-09-15    | 5
 2012-09-16    | 20
 2012-10-30    | 10

The output (with sum as the aggregate) would be:

   min     |    max     |  sum
-----------+------------+-------
2011-10-31 | 2011-11-02 |  20
2012-09-13 | 2012-09-16 |  30
2012-10-30 | 2012-10-30 |  10


Recommended answer

create table t ("date" date, "value" int);
insert into t ("date", "value") values
    ('2011-10-31', 2),
    ('2011-11-01', 8),
    ('2011-11-02', 10),
    ('2012-09-13', 1),
    ('2012-09-14', 4),
    ('2012-09-15', 5),
    ('2012-09-16', 20),
    ('2012-10-30', 10);

The simple and cheap version:

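select min("date"), max("date"), sum(value)
from (
    select
        "date", value,
        -- date minus its rank is constant within a run of consecutive dates,
        -- so it can serve as the group key
        "date" - (dense_rank() over(order by "date"))::int as g
    from t
) s
group by s.g
order by 1;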
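
To see why grouping by g works, it may help to run the inner query on its own. Within a run of consecutive dates, both the date and its dense_rank() grow by one per row, so their difference stays the same; after a gap the date jumps further than the rank does, and the difference changes. The following is only an illustrative sketch against the sample table t above; the column alias rnk is added here for display and is not part of the original answer.

```sql
-- Inner query only: show the rank and the derived group key g for each row.
select
    "date",
    value,
    dense_rank() over (order by "date") as rnk,
    "date" - (dense_rank() over (order by "date"))::int as g
from t
order by "date";

-- With the sample data, g takes three distinct values:
--   2011-10-31, 2011-11-01, 2011-11-02              -> g = 2011-10-30
--   2012-09-13, 2012-09-14, 2012-09-15, 2012-09-16  -> g = 2012-09-09
--   2012-10-30                                      -> g = 2012-10-22
```

The value of g itself has no meaning; it only needs to be equal inside a run and different between runs. Because dense_rank() gives equal dates the same rank, repeated dates would still fall into the same group.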

My first try was more complex and expensive:

create temporary sequence s;
select min("date"), max("date"), sum(value)
from (
    select
        "date", value, d,
        case
            when lag("date", 1, null) over(order by s.d) is null and "date" is not null
                then nextval('s')
            when lag("date", 1, null) over(order by s.d) is not null and "date" is not null
                then lastval()
            else 0
        end g
    from
        t
        right join
        generate_series(
            (select min("date") from t)::date,
            (select max("date") from t)::date + 1,
            '1 day'
        ) s(d) on s.d::date = t."date"
) q
where g != 0
group by g
order by 1
;
drop sequence s;

Output:

    min     |    max     | sum
------------+------------+-----
 2011-10-31 | 2011-11-02 |  20
 2012-09-13 | 2012-09-16 |  30
 2012-10-30 | 2012-10-30 |  10
(3 rows)
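
Since the question mentions lag(), here is a possible sketch of that route, which is not taken from the answer: flag each row that starts a new run, then use a running sum of the flags as the group number. The names is_start, flagged and grouped are made up for this illustration, and the query assumes the same table t as above.

```sql
select min("date"), max("date"), sum(value)
from (
    select
        "date", value,
        -- running total of run starts = number of the run this row belongs to
        sum(is_start) over (order by "date") as g
    from (
        select
            "date", value,
            -- 0 when the previous row is at most one day earlier,
            -- 1 otherwise (including the first row, where lag() is null)
            case
                when "date" - lag("date") over (order by "date") <= 1 then 0
                else 1
            end as is_start
        from t
    ) flagged
) grouped
group by g
order by 1;
```

On the sample data this should produce the same three rows as shown above. It needs two window passes instead of one, so the dense_rank() version is probably still the cheaper choice; the lag() form is mainly useful when the rule for what counts as a gap is something other than exactly one day.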
