hive官方函数解释
hive官网函数大全地址: hive官网函数大全地址
示例
1、array(value1, value2, …)
创建一个 Array 对象,由一组值组成。
SELECT array(85, 90, 78, 92, 86) AS scores;
---结果
scores
[85,90,78,92,86]
2、array_contains(Array, value)
判断一个值是否在 Array 对象中存在。
SELECT array_contains(array(85, 90, 78, 92, 86), 60) AS is_passed;
---结果
is_passed
false
SELECT if(array_contains(array(85, 90, 78, 92, 86), 60),1,0) AS is_passed;
---结果
is_passed
0
3、sort_array(Array)
sort_array 函数可以用于对 Array 对象中的元素进行排序。具体来说,sort_array 函数会将 Array 对象中的元素按照升序顺序进行排序,并返回一个新的排序后的 Array 对象。
SELECT sort_array(array(85, 90, 78, 92, 86)) AS sorted_scores;
---结果
sorted_scores
[78,85,86,90,92]
4、explode (array)
explode 函数可以用于将一个 Array 对象拆分成多行。具体来说,explode 函数会将 Array 对象中的每个元素拆分成一行,并与原始数据集中的其他字段一起返回。
select explode(array('A','B','C'));
select explode(array('A','B','C')) as col;
select tf.* from (select 0) t lateral view explode(array('A','B','C')) tf;
select tf.* from (select 0) t lateral view explode(array('A','B','C')) tf as col;
---上述四个结果均为
col
A
B
C
5、posexplode (array)
select posexplode(array('A','B','C'));
select posexplode(array('A','B','C')) as (pos,val);
select tf.* from (select 0) t lateral view posexplode(array('A','B','C')) tf;
select tf.* from (select 0) t lateral view posexplode(array('A','B','C')) tf as pos,val;
---上述四个结果均为
pos col
0 A
1 B
2 C
实战
下面全年级的学生成绩单的部分,包含学生姓名,课程,分数,年级排名,要求将每人的成绩进行汇总整理:按排名,课程和成绩的顺序
数据:
'小明' ,'语文' ,80 as score , 100
'小明' ,'数学' ,90 as score, 85
'小明' ,'英语' ,75 as score, 203
'小花' ,'语文' ,85 as score , 90
'小花' ,'数学' ,65 as score , 350
'小花' ,'英语' ,90 as score , 20
代码实现:
with students as(
select '小明' as name,'语文' as course ,80 as score , 100 as rk
union all
select '小明' as name,'数学' as course ,90 as score, 85 as rk
union all
select '小明' as name,'英语' as course ,75 as score, 203 as rk
union all
select '小花' as name,'语文' as course ,85 as score , 90 as rk
union all
select '小花' as name,'数学' as course ,65 as score , 350 as rk
union all
select '小花' as name,'英语' as course ,90 as score , 20 as rk
)
SELECT name,
concat_ws(',',
sort_array(
collect_list(
concat_ws(':',
lpad(cast(rk as string), 4, '0'),
course,
cast(score as string)
)
)
)
) AS sorted_scores
FROM students
GROUP BY name;
---结果
name sorted_scores
小明 0085:数学:90,0100:语文:80,0203:英语:75
小花 0020:英语:90,0090:语文:85,0350:数学:65