问题描述
我需要查询方面的帮助,假设这是表中的数据.
I need help with the query, let's say that this is the data in table.
timestamp
-------------------
2010-11-16 10:30:00
2010-11-16 10:37:00
2010-11-16 10:40:00
2010-11-16 10:45:00
2010-11-16 10:48:00
2010-11-16 10:55:00
2010-11-16 10:56:00
我希望每行第一行(时间戳)至少比最后一行晚5分钟.在这种情况下,查询应返回:
I want to get every first row (timestamp) that is at least 5 minutes later than the last. In this case the query should return:
timestamp
-------------------
2010-11-16 10:30:00
2010-11-16 10:37:00
2010-11-16 10:45:00
2010-11-16 10:55:00
推荐答案
递归CTE
由于每一行都取决于之前的一行,因此很难使用基于集合的方法来解决.借助递归CTE (这是标准SQL):
Recursive CTE
Since each row depends on the one before, it is hard to solve with a set-based approach. Resorting to a recursive CTE (which is standard SQL):
WITH RECURSIVE cte AS (
(SELECT ts FROM tbl
ORDER BY ts
LIMIT 1)
UNION ALL
(SELECT t.ts
FROM cte c
JOIN tbl t ON t.ts >= c.ts + interval '5 min'
ORDER BY t.ts
LIMIT 1)
)
SELECT * FROM cte ORDER BY ts;
请注意我的初稿的更新内容:
递归CTE中不允许使用聚合函数.我用ORDER BY
/LIMIT 1
代替,当ts
上的 index 支持时,应该很快.
Note the update from my first draft:
Aggregate functions are not allowed in a recursive CTE. I substituted with ORDER BY
/ LIMIT 1
, which should be fast when supported by an index on ts
.
允许使用LIMIT
时,必须在UNION
查询的每个分支周围加上括号,否则只能在UNION
查询结束时允许一次.
The parentheses around each leg of the UNION
query are necessary to allow LIMIT
, which would otherwise only be permitted once at the end of a UNION
query.
迭代排序表的过程解决方案(例如,带有plpgsql函数的示例)可能会快很多,因为它可以通过单个表扫描来完成:
A procedural solution (example with a plpgsql function) iterating through the sorted table would probably be a lot faster, since it can make do with a single table scan:
CREATE OR REPLACE FUNCTION f_rowgrid(i interval)
RETURNS SETOF timestamp AS
$func$
DECLARE
_this timestamp;
_last timestamp := '-infinity'; -- init so that 1 row passes
BEGIN
FOR _this IN
SELECT ts FROM tbl ORDER BY 1
LOOP
IF _this >= _last + i THEN
RETURN NEXT _this;
_last := _this;
END IF;
END LOOP;
END
$func$ LANGUAGE plpgsql;
致电:
SELECT * FROM f_rowgrid('5 min')
SQL小提琴 展示了这两者.
SQL Fiddle demonstrating both.
以下是这种plpgsql函数的一个稍微复杂的示例:
Here is a somewhat more complex example for this type of plpgsql function:
可以很容易地通过动态SQL通用化,而EXECUTE
可以用于任意表.
Could easily be made generic with dynamic SQL and EXECUTE
to work for arbitrary tables.
这篇关于MySQL/Postgres查询5分钟间隔数据的文章就介绍到这了,希望我们推荐的答案对大家有所帮助,也希望大家多多支持!