问题描述
我是Athena的新手,我试图了解如何将多列从长格式转换为宽格式.似乎需要 presto
,但是我只能成功地将 map_agg
应用于一个变量.我认为使用 multimap_agg
可以达到我的最终结果,但不能完全起作用.
I am new to Athena and I am trying to understand how to turn multiple columns from long to wide format. It seems like presto
is what is needed, but I've only successfully been able to apply map_agg
to one variable. I think my below final outcome can be achieved with multimap_agg
but cannot quite get it to work.
下面,我逐步介绍我的步骤和数据.如果您有任何建议或问题,请告诉我!
Below I walk through my steps and data. If you have some suggestions or questions, please let me know!
首先,数据开始如下:
id | letter | number | value
------------------------------------
123 | a | 1 | 62
123 | a | 2 | 38
123 | a | 3 | 44
123 | b | 1 | 74
123 | b | 2 | 91
123 | b | 3 | 97
123 | c | 1 | 38
123 | c | 2 | 98
123 | c | 3 | 22
456 | a | 1 | 99
456 | a | 2 | 33
456 | a | 3 | 81
456 | b | 1 | 34
456 | b | 2 | 79
456 | b | 3 | 43
456 | c | 1 | 86
456 | c | 2 | 60
456 | c | 3 | 59
然后使用 where
子句然后进行 joining
:
Then I transform the data into the below using filtering with the where
clause and then joining
:
id | letter | 1 | 2 | 3
----------------------------
123 | a | 62 | 38 | 44
123 | b | 74 | 91 | 97
123 | c | 38 | 98 | 22
456 | a | 99 | 33 | 81
456 | b | 34 | 79 | 43
456 | c | 86 | 60 | 59
对于最终结果,我想将其转换为以下内容:
For the final outcome, I would like to transform it into the below:
id | a_1 | a_2 | a_3 | b_1 | b_2 | b_3 | c_1 | c_2 | c_3
--------------------------------------------------------------------------
123 | 62 | 38 | 44 | 74 | 91 | 97 | 38 | 98 | 22
456 | 99 | 33 | 81 | 34 | 79 | 43 | 86 | 60 | 59
推荐答案
您可以使用窗口函数和条件聚合.这要求您事先知道可能的字母,以及每个id/字母元组的最大行数:
You can use window functions and conditional aggregation. This requires that you know in advance the possible letters, and the maximum rows per id/letter tuple:
select
id,
max(case when letter = 'a' and rn = 1 then value end) a_1,
max(case when letter = 'a' and rn = 2 then value end) a_2,
max(case when letter = 'a' and rn = 3 then value end) a_3,
max(case when letter = 'b' and rn = 1 then value end) b_1,
max(case when letter = 'b' and rn = 2 then value end) b_2,
max(case when letter = 'b' and rn = 3 then value end) b_3,
max(case when letter = 'c' and rn = 1 then value end) c_1,
max(case when letter = 'c' and rn = 2 then value end) c_2,
max(case when letter = 'c' and rn = 3 then value end) c_3
from (
select
t.*,
row_number() over(partition by id, letter order by number) rn
from mytable t
) t
group by id
实际上,如果 number
始终为 1
, 2
, 3
,那么您就不必甚至需要窗口功能:
Actually, if the number
s are always 1
, 2
, 3
, then you don't even need the window function:
select
id,
max(case when letter = 'a' and number = 1 then value end) a_1,
max(case when letter = 'a' and number = 2 then value end) a_2,
max(case when letter = 'a' and number = 3 then value end) a_3,
max(case when letter = 'b' and number = 1 then value end) b_1,
max(case when letter = 'b' and number = 2 then value end) b_2,
max(case when letter = 'b' and number = 3 then value end) b_3,
max(case when letter = 'c' and number = 1 then value end) c_1,
max(case when letter = 'c' and number = 2 then value end) c_2,
max(case when letter = 'c' and number = 3 then value end) c_3
from mytable t
group by id
这篇关于雅典娜presto-多列从长到宽的文章就介绍到这了,希望我们推荐的答案对大家有所帮助,也希望大家多多支持!