本文介绍了将多行转换为多行(如R中的重塑)的处理方法,对大家解决问题具有一定的参考价值,需要的朋友们下面随着小编来一起学习吧!

问题描述

我有一个表格,其中包含每个产品的数据,他们都有某些属性,比如说(现在仅限于A,B,C)。我有与每个属性相对应的值。

 产品属性值
p1 A 1
p1 B 2
p1 C 3
p2 A 1
p3 B 3
p3 C 2

而我希望我的输出是:

 产品ABC 
1 1 2 3
2 1 NULL NULL
3 NULL 3 2

到目前为止我所尝试的是它确实给了我想要的东西最后,但这个查询完全是一个大规模,我来自R背景,我想知道他们是一个简单的解决方案或像 cast 从<$ c








$ b $ code>选择
product,
max(`A`)为A,
max(`B`)为B,
max(`C`)为C
from
(当'A'时,选择
产品,
案例属性,然后`value`否则null结束为A,
案例属性当'B'then'value `else null结束为B,
case attribute when'C 'then`value` else null end as C
from test)t
group by product


解决方案

我不知道Hive中的 pivot 函数,但这是可能的。您需要将属性和值收集到地图,可以使用



示例:

 添加JAR /root/path/to/jar/brickhouse-0.7.0.jar; 
CREATE TEMPORARY FUNCTION收集AS'brickhouse.udf.collect.CollectUDAF';

选择产品
,attr_map ['A'] AS A
,attr_map ['B'] AS B
,attr_map ['C'] AS C
FROM(
SELECT product
,collect(属性,值)AS attr_map
FROM test
GROUP BY产品
)x

这里需要注意的是,如果你有很多属性,这可能是相当多的重复代码。


I have a table that contains the data that for each product, they have certain attributes, say (limited to A,B,C for now). and I have the value corresponding to each attribute.

Product Attribute Value
p1  A  1
p1  B  2
p1  C  3
p2  A  1
p3  B  3
p3  C  2

And I want my output to be:

Product    A     B    C
1          1     2    3
2          1     NULL NULL
3          NULL  3    2

What I have tried so far is and it did gave me what I want in the end, but this query is totally a big mass, I came from R background and I am wondering if their is an easy solution or something similar like cast from reshape in R.

Thanks!

select
    product, 
    max(`A`) as A, 
    max(`B`) as B, 
    max(`C`) as C 
from
(select 
    product,
    case attribute when 'A' then `value` else null end as A,
    case attribute when 'B' then `value` else null end as B,
    case attribute when 'C' then `value` else null end as C
from test) t
group by product
解决方案

I'm not aware of a pivot function in Hive but this is possible. You will need to collect attribute and value to a map, which can be done using this UDAF

Example:

ADD JAR /root/path/to/jar/brickhouse-0.7.0.jar;
CREATE TEMPORARY FUNCTION collect AS 'brickhouse.udf.collect.CollectUDAF';

SELECT product
    , attr_map['A'] AS A
    , attr_map['B'] AS B
    , attr_map['C'] AS C
FROM (
    SELECT product
        , collect(attribute, value) AS attr_map
    FROM test
    GROUP BY product
    ) x

The caveat here is that if you have a lot of attributes, this can be quite a bit of repetitive code.

这篇关于将多行转换为多行(如R中的重塑)的文章就介绍到这了,希望我们推荐的答案对大家有所帮助,也希望大家多多支持!

10-21 15:00