Hive - Grouping rows into a map

Question

I have a table like this:

Col1   Col2   Col3
A      1      word1
A      2      word2
A      3      word3
A      4      word4
B      1      word1
B      3      word3

I want to group the rows by col1, but keep col2 and col3 together in a map, like so:

Col1   map(col2, col3)
A      [(1, word1), (2, word2), (3, word3), (4, word4)]
B      [(1, word1), (3, word3)]

I know there is a way to do this with just an array, as shown in the question "Grouping hive rows in an array of this rows".

But I'm wondering whether this is possible with a map (key/value pairs).

Recommended answer

Use the "collect" UDAF from Brickhouse (http://github.com/klout/brickhouse). Called with two arguments, it aggregates the key/value pairs of each group into a map:

-- collect(key, value) builds one map per group, with col2 as key and col3 as value
select col1, collect(col2, col3) as col2_to_col3
from mytable
group by col1;
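
The query above only works once the Brickhouse jar is loaded and the function is registered in the session. A minimal sketch of that setup follows. The jar path is hypothetical, and the class name is an assumption taken from the registration script shipped with Brickhouse, so check both against the version you are using.

```sql
-- Hypothetical path: point this at the Brickhouse jar you built or downloaded.
ADD JAR /path/to/brickhouse.jar;

-- Class name is an assumption based on Brickhouse's own registration script;
-- verify it against your Brickhouse version.
CREATE TEMPORARY FUNCTION collect AS 'brickhouse.udf.collection.CollectUDAF';
```

A temporary function lasts only for the current session, so these statements have to run before the select in every new session.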

You can also merge maps with the "union_map" UDAF, also from Brickhouse.
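
As a rough illustration of the merging approach, the sketch below wraps each row in a single-entry map with Hive's built-in map() constructor and then merges those maps per col1. It assumes the Brickhouse jar has already been added to the session, and the class name is an assumption taken from the Brickhouse registration script, not something stated in the answer.

```sql
-- Assumes the Brickhouse jar is already on the classpath (ADD JAR ...).
-- Class name is an assumption; verify it against your Brickhouse version.
CREATE TEMPORARY FUNCTION union_map AS 'brickhouse.udf.collection.UnionUDAF';

-- Inner query: one single-entry map per row, built with Hive's map() constructor.
-- Outer query: merge all of a group's maps into one map per col1.
SELECT col1, union_map(m) AS col2_to_col3
FROM (
  SELECT col1, map(col2, col3) AS m
  FROM mytable
) t
GROUP BY col1;
```

For the table in the question, collect(col2, col3) is the more direct choice. Merging is mainly useful when the rows already hold map columns that need to be combined.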


10-18 21:15