Problem Description
I have monthly time series data. I would like to sum the values for each ID, grouped by month, and then have the month names as columns rather than rows.
+----+------------+-------+-------+
| id | extra_info | month | value |
+----+------------+-------+-------+
| 1 | abc | jan | 10 |
| 1 | abc | feb | 20 |
| 2 | def | jan | 10 |
| 2 | def | feb | 5 |
| 1 | abc | jan | 15 |
| 3 | ghi | mar | 15 |
Desired Result
+----+------------+-----+-----+-----+
| id | extra_info | jan | feb | mar |
+----+------------+-----+-----+-----+
| 1 | abc | 25 | 20 | 0 |
| 2 | def | 10 | 5 | 0 |
| 3 | ghi | 0 | 0 | 15 |
Current Approach
I can easily group by month and sum the values, which gets me to:
+----+------------+-------+-------+
| id | extra_info | month | value |
+----+------------+-------+-------+
| 1 | abc | jan | 25 |
| 1 | abc | feb | 20 |
| 2 | def | jan | 10 |
| 2 | def | feb | 5 |
| 3 | ghi | mar | 15 |
But I now need those months as column names, and I am not sure where to go from here.
- In terms of language, this query will run in Postgres.
- The months above are just examples; the real data set is much larger and covers all 12 months across thousands of IDs.
Any ideas from an SQL guru would be very much appreciated!
Recommended Answer
tablefunc module
I would use crosstab() for this. Install the additional module tablefunc if you don't have it already:
CREATE EXTENSION tablefunc;
Basics here:
PostgreSQL Crosstab Query
How to deal with extra columns:
Pivot on Multiple Columns using Tablefunc
Advanced usage:
Dynamic alternative to pivot with CASE and GROUP BY
CREATE TEMP TABLE tbl
(id int, extra_info varchar(3), month date, value int);
INSERT INTO tbl (id, extra_info, month, value)
VALUES
(1, 'abc', '2012-01-01', 10),
(1, 'abc', '2012-02-01', 20),
(2, 'def', '2012-01-01', 10),
(2, 'def', '2012-02-01', 5),
(1, 'abc', '2012-01-01', 15),
(3, 'ghi', '2012-03-01', 15);
I am using an actual date in the base table, since I assume you are just hiding that in an effort to simplify your question. With just month names, there would be nothing to ORDER BY.
SELECT * FROM crosstab(
$$SELECT id, extra_info, to_char(month, 'mon'), sum(value) AS value
FROM tbl
GROUP BY 1,2,month
ORDER BY 1,2,month$$
,$$VALUES
('jan'::text), ('feb'), ('mar'), ('apr'), ('may'), ('jun')
, ('jul'), ('aug'), ('sep'), ('oct'), ('nov'), ('dec')$$
)
AS ct (id int, extra text
, jan int, feb int, mar int, apr int, may int, jun int
, jul int, aug int, sep int, oct int, nov int, dec int);
Result:
id | extra | jan | feb | mar | apr | may | jun | jul | aug | sep | oct | nov | dec
----+-------+-----+-----+-----+-----+-----+-----+-----+-----+-----+-----+-----+-----
1 | abc | 25 | 20 | | | | | | | | | |
2 | def | 10 | 5 | | | | | | | | | |
3 | ghi | | | 15 | | | | | | | | |
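The desired result in the question shows 0 for months without data, whereas crosstab() leaves those cells NULL. One way to close that gap is to wrap each output column in COALESCE. The following is only a sketch: it assumes the temp table tbl from above still exists in the session and that tablefunc is installed, and it spells out just three months to stay short.

```sql
-- Assumes the temp table "tbl" defined above exists in the current session.
CREATE EXTENSION IF NOT EXISTS tablefunc;  -- does nothing if already installed

SELECT id, extra
     , COALESCE(jan, 0) AS jan   -- replace NULL (no rows for that month) with 0
     , COALESCE(feb, 0) AS feb
     , COALESCE(mar, 0) AS mar
FROM   crosstab(
   $$SELECT id, extra_info, to_char(month, 'mon'), sum(value) AS value
     FROM   tbl
     GROUP  BY 1,2,month
     ORDER  BY 1,2,month$$
 , $$VALUES ('jan'::text), ('feb'), ('mar')$$  -- only three categories in this sketch
   ) AS ct (id int, extra text, jan int, feb int, mar int);
```

For all twelve months, extend the VALUES list, the column definition list, and the SELECT list in the same way.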
Installing the tablefunc module requires some overhead and some learning, but the resulting queries are much faster, shorter, and more versatile.
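If installing tablefunc is not an option, the same pivot can be written in plain SQL with conditional aggregation. This is a different technique from crosstab(), shown only as a hedged sketch: it relies on the aggregate FILTER clause (PostgreSQL 9.4 or later), and the CTE name tbl_demo is a hypothetical stand-in for the temp table above so that the statement runs on its own.

```sql
-- Plain-SQL pivot via conditional aggregation; no tablefunc required.
-- "tbl_demo" is a hypothetical inline copy of the sample data.
WITH tbl_demo (id, extra_info, month, value) AS (
   VALUES
     (1, 'abc', date '2012-01-01', 10)
   , (1, 'abc', date '2012-02-01', 20)
   , (2, 'def', date '2012-01-01', 10)
   , (2, 'def', date '2012-02-01', 5)
   , (1, 'abc', date '2012-01-01', 15)
   , (3, 'ghi', date '2012-03-01', 15)
)
SELECT id, extra_info
       -- one filtered sum per output column; COALESCE turns "no rows" into 0
     , COALESCE(sum(value) FILTER (WHERE to_char(month, 'mon') = 'jan'), 0) AS jan
     , COALESCE(sum(value) FILTER (WHERE to_char(month, 'mon') = 'feb'), 0) AS feb
     , COALESCE(sum(value) FILTER (WHERE to_char(month, 'mon') = 'mar'), 0) AS mar
FROM   tbl_demo
GROUP  BY id, extra_info
ORDER  BY id, extra_info;
```

Each additional month needs one more filtered sum, so the query grows longer than the crosstab() version, which is the trade-off against not having to install the module.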