Postgres查询5分钟间隔数据

Postgres查询5分钟间隔数据

本文介绍了MySQL/Postgres查询5分钟间隔数据的处理方法,对大家解决问题具有一定的参考价值,需要的朋友们下面随着小编来一起学习吧!

问题描述

我需要查询方面的帮助,假设这是表中的数据.

I need help with the query, let's say that this is the data in table.

timestamp
-------------------
2010-11-16 10:30:00
2010-11-16 10:37:00
2010-11-16 10:40:00
2010-11-16 10:45:00
2010-11-16 10:48:00
2010-11-16 10:55:00
2010-11-16 10:56:00

我希望每行第一行(时间戳)至少比最后一行晚5分钟.在这种情况下,查询应返回:

I want to get every first row (timestamp) that is at least 5 minutes later than the last. In this case the query should return:

timestamp
-------------------
2010-11-16 10:30:00
2010-11-16 10:37:00
2010-11-16 10:45:00
2010-11-16 10:55:00

推荐答案

递归CTE

由于每一行都取决于之前的一行,因此很难使用基于集合的方法来解决.借助递归CTE (这是标准SQL):

Recursive CTE

Since each row depends on the one before, it is hard to solve with a set-based approach. Resorting to a recursive CTE (which is standard SQL):

WITH RECURSIVE cte AS (
   (SELECT ts FROM tbl
    ORDER  BY ts
    LIMIT  1)

   UNION ALL
   (SELECT t.ts
    FROM   cte c
    JOIN   tbl t ON t.ts >= c.ts + interval '5 min'
    ORDER  BY t.ts
    LIMIT  1)
   )
SELECT * FROM cte ORDER BY ts;

请注意我的初稿的更新内容:
递归CTE中不允许使用聚合函数.我用ORDER BY/LIMIT 1代替,当ts上的 index 支持时,应该很快.

Note the update from my first draft:
Aggregate functions are not allowed in a recursive CTE. I substituted with ORDER BY / LIMIT 1, which should be fast when supported by an index on ts.

允许使用LIMIT时,必须在UNION查询的每个分支周围加上括号,否则只能在UNION查询结束时允许一次.

The parentheses around each leg of the UNION query are necessary to allow LIMIT, which would otherwise only be permitted once at the end of a UNION query.

迭代排序表的过程解决方案(例如,带有plpgsql函数的示例)可能会快很多,因为它可以通过单个表扫描来完成:

A procedural solution (example with a plpgsql function) iterating through the sorted table would probably be a lot faster, since it can make do with a single table scan:

CREATE OR REPLACE FUNCTION f_rowgrid(i interval)
  RETURNS SETOF timestamp AS
$func$
DECLARE
   _this  timestamp;
   _last  timestamp := '-infinity';     -- init so that 1 row passes
BEGIN

FOR _this IN
    SELECT ts FROM tbl ORDER BY 1
LOOP
    IF _this >= _last + i THEN
       RETURN NEXT _this;
       _last := _this;
    END IF;
END LOOP;

END
$func$ LANGUAGE plpgsql;

致电:

SELECT * FROM  f_rowgrid('5 min')

SQL小提琴 展示了这两者.

SQL Fiddle demonstrating both.

以下是这种plpgsql函数的一个稍微复杂的示例:

Here is a somewhat more complex example for this type of plpgsql function:

可以很容易地通过动态SQL通用化,而EXECUTE可以用于任意表.

Could easily be made generic with dynamic SQL and EXECUTE to work for arbitrary tables.

这篇关于MySQL/Postgres查询5分钟间隔数据的文章就介绍到这了,希望我们推荐的答案对大家有所帮助,也希望大家多多支持!

08-01 03:33