My table schema is (state string, city string, size int).
This is my input data:
Karnataka,Bangalore,200
Karnataka,Mysore,50
Karnataka,Bellary,100
Karnataka,Mangalore,10
Andhra pradesh,Chittoor, 25
Andhra pradesh,nellore, 15
Andhra pradesh,guntur, 20
Andhra pradesh,tirupathi, 30
Andhra pradesh,vizag, 35
Andhra pradesh,kadapa, 45
I want to retrieve the top 2 cities in each state along with their sizes, and I want the following output:
(Andhra pradesh,{(35),(30)},{(vizag),(tirupathi)})
(Karnataka,{(200),(100)},{(Bangalore),(Bellary)})
I wrote the query below, but I get the error "Expression not in GROUP BY key size". Please help.
select * from statefile group by state,city order by size limit 2;
Thanks in advance.
Best answer
You would use row_number():
select sf.*
from (select sf.*,
             row_number() over (partition by state order by size desc) as seqnum
      from statefile sf
     ) sf
where seqnum <= 2;
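The row_number() query returns one row per city. To get closer to the one-row-per-state shape shown in the question, the ranked rows could be aggregated afterwards. The following is only a minimal sketch, assuming Hive's collect_list() aggregate is available (Hive 0.13 or later); the aliases ranked, sizes, and cities are illustrative names and not part of the original answer. Hive does not guarantee the order of elements inside collect_list(), so the arrays may not come back sorted by size.

```sql
-- Sketch: keep the two largest cities per state, then fold them into arrays.
-- Assumes the statefile(state, city, size) table from the question.
select ranked.state,
       collect_list(ranked.size) as sizes,   -- element order is not guaranteed
       collect_list(ranked.city) as cities
from (select sf.*,
             row_number() over (partition by state order by size desc) as seqnum
      from statefile sf
     ) ranked
where ranked.seqnum <= 2
group by ranked.state;
```

Going by the sample values, the two largest entries for Andhra pradesh are kadapa (45) and vizag (35), so those are the rows a top-2-by-size query would be expected to keep, rather than the 35 and 30 listed in the desired output.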
Regarding hadoop - Hive expression not in group by key size, we found a similar question on Stack Overflow: https://stackoverflow.com/questions/38625793/