Problem description
I have a table like this in PostgreSQL. I want to apply aggregate functions such as mean and max to every 16 records, ordered by ID (which is the primary key). For example, I need the mean value for the first 16 records, then for the second 16 records, and so on.
+----+----------+
| ID | rainfall |
+----+----------+
|  1 |    110.2 |
|  2 |     56.6 |
|  3 |     65.6 |
|  4 |     75.9 |
+----+----------+
Recommended answer
The first approach that comes to mind is to use row_number() to annotate the table, then group by blocks of 16 rows.
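SELECT min(id) AS first_id,
       max(id) AS last_id,
       avg(rainfall) AS avg_this_16
FROM (
    SELECT id, rainfall, row_number() OVER (ORDER BY id) AS n
    FROM the_table
) x(id, rainfall, n)
-- row_number() starts at 1, so shift to 0-based before the integer division;
-- grouping by n/16 would put only 15 rows in the first block.
GROUP BY (n - 1) / 16
ORDER BY (n - 1) / 16;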
Note that the last group won't necessarily contain 16 samples.
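To see the block sizes concretely, here is a minimal sketch that runs the same kind of query from Python. It assumes SQLite (3.25 or later, for window functions) as a stand-in for PostgreSQL, and the 40 rows with gappy ids are made up for illustration. The row number is shifted by one before dividing because row_number() starts at 1.

```python
import sqlite3

# Assumption: SQLite stands in for PostgreSQL; the data below is hypothetical.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE the_table (id INTEGER PRIMARY KEY, rainfall REAL)")

# 40 rows whose ids have gaps (3, 6, 9, ...), so id itself cannot be used for bucketing.
rows = [(3 * i, float(i)) for i in range(1, 41)]
conn.executemany("INSERT INTO the_table VALUES (?, ?)", rows)

query = """
SELECT min(id) AS first_id,
       max(id) AS last_id,
       count(*) AS samples,
       avg(rainfall) AS avg_this_16
FROM (
    SELECT id, rainfall, row_number() OVER (ORDER BY id) AS n
    FROM the_table
) AS x
GROUP BY (n - 1) / 16   -- 1-based row number shifted to 0-based, then integer division
ORDER BY (n - 1) / 16
"""
blocks = conn.execute(query).fetchall()
for block in blocks:
    print(block)
```

With 40 rows, the third block holds only the 8 leftover samples, which is the partial last group mentioned above.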
Alternatively, you can calculate a running average by using avg() as a window function:
SELECT id, avg(rainfall) OVER (ORDER BY id ROWS 15 PRECEDING) FROM the_table;
... possibly annotating that with the row number and selecting only the rows you want (here, every 16th row):
SELECT id AS greatest_id_in_group, avg_last_16_inclusive
FROM (
    SELECT id,
           avg(rainfall) OVER (ORDER BY id ROWS 15 PRECEDING) AS avg_last_16_inclusive,
           row_number() OVER (ORDER BY id) AS n
    FROM the_table
) x
WHERE n % 16 = 0;
This disregards the last n < 16 samples (any trailing rows that don't fill a complete group of 16); no row is returned for them.
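The effect of the n % 16 = 0 filter can be checked with a small sketch. As an assumption, SQLite (3.25 or later) is used in place of PostgreSQL, and the 40 rows are hypothetical. Each kept row is the 16th row of its block, so its 16-row trailing window covers exactly that block.

```python
import sqlite3

# Assumption: SQLite stands in for PostgreSQL; the data below is hypothetical.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE the_table (id INTEGER PRIMARY KEY, rainfall REAL)")
rows = [(3 * i, float(i)) for i in range(1, 41)]  # 40 rows, non-contiguous ids
conn.executemany("INSERT INTO the_table VALUES (?, ?)", rows)

query = """
SELECT id AS greatest_id_in_group, avg_last_16_inclusive
FROM (
    SELECT id,
           -- average over the current row and the 15 rows before it
           avg(rainfall) OVER (ORDER BY id ROWS 15 PRECEDING) AS avg_last_16_inclusive,
           row_number() OVER (ORDER BY id) AS n
    FROM the_table
) AS x
WHERE n % 16 = 0   -- keep only rows 16, 32, ... (the last row of each full block)
ORDER BY id
"""
full_blocks = conn.execute(query).fetchall()
for row in full_blocks:
    print(row)
```

Only two rows come back for 40 samples: the 8 trailing samples never reach a row number divisible by 16, so they produce no output row.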
Note that I'm assuming the IDs aren't guaranteed to be contiguous. If they are gap-less, you can group directly on the id and avoid the window function. With ids starting at 1, that means grouping by (id - 1) / 16 rather than id / 16, since plain id / 16 would leave only 15 rows in the first group.
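For the gap-less case, a minimal sketch of grouping directly on the id follows. It assumes ids run 1, 2, 3, ... with no gaps, uses SQLite as a stand-in for PostgreSQL, and the 40 rows are hypothetical. It also compares id / 16 with (id - 1) / 16 to show why the shift by one matters when ids start at 1.

```python
import sqlite3

# Assumption: SQLite stands in for PostgreSQL; ids are contiguous and start at 1.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE the_table (id INTEGER PRIMARY KEY, rainfall REAL)")
conn.executemany(
    "INSERT INTO the_table VALUES (?, ?)",
    [(i, float(i)) for i in range(1, 41)],  # 40 hypothetical rows, ids 1..40
)

# Shifted bucketing: ids 1-16 -> bucket 0, 17-32 -> bucket 1, 33-40 -> bucket 2.
shifted = conn.execute(
    """
    SELECT (id - 1) / 16 AS bucket, count(*) AS samples, avg(rainfall) AS avg_this_16
    FROM the_table
    GROUP BY bucket
    ORDER BY bucket
    """
).fetchall()

# Unshifted bucketing: ids 1-15 land in bucket 0, so the first block is one row short.
unshifted_sizes = [
    samples
    for (samples,) in conn.execute(
        "SELECT count(*) FROM the_table GROUP BY id / 16 ORDER BY id / 16"
    )
]

print(shifted)
print(unshifted_sizes)
```

No window function is needed here because the id already plays the role of the row number; the approach breaks as soon as a row is deleted and a gap appears.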