Hi,
I have a Postgres table
CREATE TABLE my_table (
    "id" serial,
    "sensorid" integer,
    "actorid" integer,
    "timestamp" timestamp without time zone
);
with the following sample data
id; sensorid; actorid; timestamp
1; 2267; 3023; "2013-07-09 12:20:06.446"
2; 2267; 3023; "2013-07-09 12:20:16.421"
3; 2267; 3023; "2013-07-09 12:20:30.661"
4; 2267; 3023; "2013-07-09 12:20:36.958"
5; 2267; 3023; "2013-07-09 12:20:49.508"
6; 2267; 3023; "2013-07-09 12:20:57.683"
7; 3301; 3023; "2013-08-15 06:03:03.428"
8; 2267; 3024; "2013-07-09 12:19:52.196"
9; 2267; 3024; "2013-07-09 12:20:16.515"
10; 2267; 3024; "2013-07-09 12:20:42.341"
11; 2267; 3025; "2013-07-09 12:21:05.98"
12; 2268; 3026; "2013-07-09 12:22:35.03"
13; 2268; 3026; "2013-07-09 12:22:45.066"
14; 3192; 3026; "2013-08-09 07:41:31.206"
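I want to group the records by the following criteria:
they have the same sensorid
they have the same actorid
(and this is the tricky part:) the gap between consecutive records is less than, say, 5 minutes. That is, a group may span more than an hour overall, as long as no two neighbouring records in it are more than 5 minutes apart. The timestamps of a group can be aggregated as an average.
In addition, the number of records aggregated into each group must be reported, because oversized groups have to be identified.
So the output should look something like this: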
id; sensorid; actorid; avg; count
1; 2267; 3023; "2013-07-09 12:20:30.000"; 7;
2; 3301; 3023; "2013-08-15 06:03:03.428"; 1;
3; 2267; 3024; "2013-07-09 12:20:06.415"; 3;
5; 2267; 3025; "2013-07-09 12:21:05.98"; 1;
6; 2268; 3026; "2013-07-09 12:22:40.626"; 2;
7; 3192; 3026; "2013-08-09 07:41:31.206"; 1;
Thanks for your help!
Dennis
Best answer
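First, use lag() to fetch the previous timestamp within each sensorid/actorid pair and flag whether the current row starts a new period: isStart is 1 when there is no previous row, or when the previous row is more than 5 minutes older. Then take a cumulative sum of isStart over each sensorid/actorid pair; that running total serves as the group number within the pair.
Finally, aggregate, including this new group number in the GROUP BY: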
select sensorid, actorid, min(timestamp), max(timestamp), count(*) as numInGroup
from (select t.*,
             -- running total of the start flags = group number within the pair
             sum(isStart) over (partition by sensorid, actorid order by timestamp) as grp
      from (select t.*,
                   -- 1 when the row opens a new group: no predecessor, or a gap of more than 5 minutes
                   (case when prevts is null or prevts < timestamp - interval '5 minutes'
                         then 1 else 0
                    end) as isStart
            from (select t.*,
                         -- previous timestamp for the same sensorid/actorid pair
                         lag(timestamp) over (partition by sensorid, actorid
                                              order by timestamp) as prevts
                  from my_table t
                 ) t
           ) t
     ) t
group by sensorid, actorid, grp;
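To sanity-check the grouping logic outside the database, the same three steps (previous timestamp, start flag, running sum) can be sketched in plain Python. This is only an illustration and not part of the original answer: the names group_rows, ROWS, MAX_GAP and FMT are made up here, and the rows are copied from the question's sample data. The average timestamp is computed as the group's first timestamp plus the mean offset from it, which is one possible way to get the "avg" column the question asks for.

```python
from datetime import datetime, timedelta
from itertools import groupby

MAX_GAP = timedelta(minutes=5)   # hypothetical name for the 5-minute threshold
FMT = "%Y-%m-%d %H:%M:%S.%f"     # %f accepts 1 to 6 fractional digits

# (sensorid, actorid, timestamp) copied from the question's sample data
ROWS = [
    (2267, 3023, "2013-07-09 12:20:06.446"),
    (2267, 3023, "2013-07-09 12:20:16.421"),
    (2267, 3023, "2013-07-09 12:20:30.661"),
    (2267, 3023, "2013-07-09 12:20:36.958"),
    (2267, 3023, "2013-07-09 12:20:49.508"),
    (2267, 3023, "2013-07-09 12:20:57.683"),
    (3301, 3023, "2013-08-15 06:03:03.428"),
    (2267, 3024, "2013-07-09 12:19:52.196"),
    (2267, 3024, "2013-07-09 12:20:16.515"),
    (2267, 3024, "2013-07-09 12:20:42.341"),
    (2267, 3025, "2013-07-09 12:21:05.98"),
    (2268, 3026, "2013-07-09 12:22:35.03"),
    (2268, 3026, "2013-07-09 12:22:45.066"),
    (3192, 3026, "2013-08-09 07:41:31.206"),
]


def group_rows(rows, max_gap=MAX_GAP):
    """Return (sensorid, actorid, grp, avg_timestamp, count) tuples."""
    # Sorting by (sensorid, actorid, timestamp) plays the role of
    # "partition by sensorid, actorid order by timestamp".
    parsed = sorted((s, a, datetime.strptime(ts, FMT)) for s, a, ts in rows)
    result = []
    for (sensor, actor), pair_rows in groupby(parsed, key=lambda r: (r[0], r[1])):
        prev = None    # what lag(timestamp) would return
        grp = 0        # running sum of the start flags
        buckets = {}   # grp -> timestamps in that group
        for _, _, ts in pair_rows:
            is_start = 1 if prev is None or prev < ts - max_gap else 0
            grp += is_start
            buckets.setdefault(grp, []).append(ts)
            prev = ts
        for g, stamps in buckets.items():
            first = stamps[0]
            # Average timestamp = first timestamp + mean offset from it.
            mean_offset = sum((t - first for t in stamps), timedelta()) / len(stamps)
            result.append((sensor, actor, g, first + mean_offset, len(stamps)))
    return result


if __name__ == "__main__":
    for sensor, actor, g, avg_ts, count in group_rows(ROWS):
        print(sensor, actor, g, avg_ts, count)
```

On the sample rows this produces six groups, with counts 6, 3, 1, 2, 1 and 1 when ordered by sensorid and actorid. Because the start flag only compares each row with its immediate predecessor, a long chain of records spaced under 5 minutes apart stays in one group no matter how long the chain runs overall.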