问题描述
免责声明:所示问题比我最初预期的要普遍得多。下面的示例来自一个解决方案,另一个问题。但是,现在我拿这个样本来解决更多问题-主要与时间序列有关(请看一下右侧栏中的链接部分)。
Disclaimer: The shown problem is much more general than I expected first. The example below is taken from a solution to another question. But now I was taking this sample for solving many problems more - mostly related to time series (have a look at the "Linked" section in the right bar).
所以我先尝试更笼统地解释这个问题:
So I am trying to explain the problem more generally first:
我正在使用PostgreSQL,但是我确信这个问题在支持DBMS的其他窗口功能(MS SQL Server,Oracle等)中也存在。
可用于将某些值组合在一起通过共同的属性或值。例如,您可以按日期对行进行分组。然后,您可以计算每个日期内的最大值,平均值或计数行数之类的值。
Window functions can be used to group certain values together by a common attribute or value. For example you can group rows by a date. Then you are able to calculate the max value within every single date or an average value or counting rows or whatever.
这可以通过定义 PARTITION 。按日期分组将与 strong>?
The question is: How to get the group ids with respect to the order by ts?
编辑:我在下面添加了自己的解决方案,但是我对此感到非常不舒服它。似乎太复杂了。 我想知道是否有更好的方法来达到这个结果。
I added an own solution below but I feel very uncomfortable with it. It seems way too complicated. I was wondering if there's a better way to achieve this result.
推荐答案
我想到了这个我自己的解决方案(希望别人会得到更好的):
I came up with this solution by myself (hoping someone else will get a better one):
- order通过 ts
- 给出下一个 val 值,并带有 lag 窗口函数()
- 检查下一个和当前值是否相同。然后我可以打印出 0 或 1
- 这些总和有序 SUM 的值。这将生成我要寻找的组。他们将 val 列分组,但是通过 ts 列确保排序。
- order by ts
- give out the next val value with the lag window function (https://www.postgresql.org/docs/current/static/tutorial-window.html)
- check if the next and the current values are the same. Then I can print out a 0 or a 1
- sum up these values with an ordered SUM. This generates the groups I am looking for. They group the val column but ensure the ordering by the ts column.
查询:
SELECT *, SUM(is_diff) OVER (ORDER BY ts) FROM ( SELECT *, CASE WHEN val = lag(val) over (order by ts) THEN 0 ELSE 1 END as is_diff FROM test )s
结果:
ts val is_diff sum 100000 50 1 1 130100 30050 1 2 160100 60050 1 3 190200 100 1 4 220200 30100 1 5 \ group 250200 30100 0 5 / 300000 300 1 6 500000 100 1 7 550000 1000 1 8 \ group 600000 1000 0 8 / 650000 2000 1 9 \ 700000 2000 0 9 | group 720000 2000 0 9 / 750000 300 1 10
这篇关于窗口函数:在第一列之后按另一列划分的文章就介绍到这了,希望我们推荐的答案对大家有所帮助,也希望大家多多支持!