本文介绍了雅典娜presto-多列从长到宽的处理方法,对大家解决问题具有一定的参考价值,需要的朋友们下面随着小编来一起学习吧!

问题描述

我是Athena的新手,我试图了解如何将多列从长格式转换为宽格式.似乎需要 presto ,但是我只能成功地将 map_agg 应用于一个变量.我认为使用 multimap_agg 可以达到我的最终结果,但不能完全起作用.

I am new to Athena and I am trying to understand how to turn multiple columns from long to wide format. It seems like presto is what is needed, but I've only successfully been able to apply map_agg to one variable. I think my below final outcome can be achieved with multimap_agg but cannot quite get it to work.

下面,我逐步介绍我的步骤和数据.如果您有任何建议或问题,请告诉我!

Below I walk through my steps and data. If you have some suggestions or questions, please let me know!

首先,数据开始如下:

id  | letter    | number   | value
------------------------------------
123 | a         | 1        | 62
123 | a         | 2        | 38
123 | a         | 3        | 44
123 | b         | 1        | 74
123 | b         | 2        | 91
123 | b         | 3        | 97
123 | c         | 1        | 38
123 | c         | 2        | 98
123 | c         | 3        | 22
456 | a         | 1        | 99
456 | a         | 2        | 33
456 | a         | 3        | 81
456 | b         | 1        | 34
456 | b         | 2        | 79
456 | b         | 3        | 43
456 | c         | 1        | 86
456 | c         | 2        | 60
456 | c         | 3        | 59

然后使用 where 子句然后进行 joining :

Then I transform the data into the below using filtering with the where clause and then joining:

id  | letter  | 1  | 2  | 3
----------------------------
123 | a       | 62 | 38 | 44
123 | b       | 74 | 91 | 97
123 | c       | 38 | 98 | 22
456 | a       | 99 | 33 | 81
456 | b       | 34 | 79 | 43
456 | c       | 86 | 60 | 59

对于最终结果,我想将其转换为以下内容:

For the final outcome, I would like to transform it into the below:

id  | a_1   | a_2   | a_3   | b_1   | b_2   | b_3   | c_1   | c_2   | c_3
--------------------------------------------------------------------------
123 | 62    | 38    | 44    | 74    | 91    | 97    | 38    | 98    | 22
456 | 99    | 33    | 81    | 34    | 79    | 43    | 86    | 60    | 59

推荐答案

您可以使用窗口函数和条件聚合.这要求您事先知道可能的字母,以及每个id/字母元组的最大行数:

You can use window functions and conditional aggregation. This requires that you know in advance the possible letters, and the maximum rows per id/letter tuple:

select
    id,
    max(case when letter = 'a' and rn = 1 then value end) a_1,
    max(case when letter = 'a' and rn = 2 then value end) a_2,
    max(case when letter = 'a' and rn = 3 then value end) a_3,
    max(case when letter = 'b' and rn = 1 then value end) b_1,
    max(case when letter = 'b' and rn = 2 then value end) b_2,
    max(case when letter = 'b' and rn = 3 then value end) b_3,
    max(case when letter = 'c' and rn = 1 then value end) c_1,
    max(case when letter = 'c' and rn = 2 then value end) c_2,
    max(case when letter = 'c' and rn = 3 then value end) c_3
from (
    select
        t.*,
        row_number() over(partition by id, letter order by number) rn
    from mytable t
) t
group by id

实际上,如果 number 始终为 1 2 3 ,那么您就不必甚至需要窗口功能:

Actually, if the numbers are always 1, 2, 3, then you don't even need the window function:

select
    id,
    max(case when letter = 'a' and number = 1 then value end) a_1,
    max(case when letter = 'a' and number = 2 then value end) a_2,
    max(case when letter = 'a' and number = 3 then value end) a_3,
    max(case when letter = 'b' and number = 1 then value end) b_1,
    max(case when letter = 'b' and number = 2 then value end) b_2,
    max(case when letter = 'b' and number = 3 then value end) b_3,
    max(case when letter = 'c' and number = 1 then value end) c_1,
    max(case when letter = 'c' and number = 2 then value end) c_2,
    max(case when letter = 'c' and number = 3 then value end) c_3
from mytable t
group by id

这篇关于雅典娜presto-多列从长到宽的文章就介绍到这了,希望我们推荐的答案对大家有所帮助,也希望大家多多支持!

07-25 02:48