My table schema is (state string, city string, size int).

Here is my input data:

Karnataka,Bangalore,200
Karnataka,Mysore,50
Karnataka,Bellary,100
Karnataka,Mangalore,10
Andhra pradesh,Chittoor, 25
Andhra pradesh,nellore, 15
Andhra pradesh,guntur, 20
Andhra pradesh,tirupathi, 30
Andhra pradesh,vizag, 35
Andhra pradesh,kadapa, 45

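I want to retrieve the top 2 cities in each state, ranked by size, along with their sizes. I would like the following output: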
(Andhra pradesh,{(45),(35)},{(kadapa),(vizag)})
(Karnataka,{(200),(100)},{(Bangalore),(Bellary)})

I wrote the query below, but it fails with the error "Expression not in GROUP BY key size". Please help.
select * from statefile group by state,city order by size limit 2;

Thanks in advance.

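Best answer

The error occurs because size is selected (through *) but is neither listed in the GROUP BY clause nor wrapped in an aggregate function. In any case, GROUP BY with a single LIMIT 2 would return two rows overall, not two per state. To rank the cities within each state and keep the top two, use row_number():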

select sf.*
from (select sf.*,
             row_number() over (partition by state order by size desc) as seqnum
      from statefile sf
     ) sf
where seqnum <= 2;
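
The query above returns one row per city. If you want one row per state, as in the output shown in the question, a possible sketch is to aggregate the ranked rows with collect_list(). This assumes a Hive version that provides collect_list(), and the alias "ranked" and the column names "sizes" and "cities" are made up for illustration. Note that Hive returns arrays here rather than the bag-style braces in the question, and it does not guarantee the order of elements inside each array, so the two lists may not come back sorted by size.

```sql
-- Sketch only: collapse the top-2 rows per state into a single row per state.
-- "ranked", "sizes" and "cities" are illustrative names, not from the original post.
select state,
       collect_list(size) as sizes,   -- sizes of the top-2 cities (element order not guaranteed)
       collect_list(city) as cities   -- names of the top-2 cities (element order not guaranteed)
from (select sf.*,
             row_number() over (partition by state order by size desc) as seqnum
      from statefile sf
     ) ranked
where seqnum <= 2
group by state;
```

If two cities in a state have the same size, row_number() breaks the tie arbitrarily; adding city to the order by inside the window would make the choice deterministic.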

For more on "hadoop - Hive: Expression not in GROUP BY key size", see the similar question on Stack Overflow: https://stackoverflow.com/questions/38625793/
