问题描述
在提供的以下XML中,我需要将{Name,Value}对与ParentID标签一起爆炸,并适当地映射它们,"Parent"全家福:
In the below XML provided, I need to explode {Name, Value} pairs along with ParentID tags and map them appropriately, "Parent" familywise:
<Parents>
<Parent>
<ParentID>12345</ParentID>
<ParentArray>
<ParentField>
<Name>ABCD</Name>
<Value>111</Value>
</ParentField>
</ParentArray>
</Parent>
<Parent>
<ParentID>54321</ParentID>
<ParentArray>
<ParentField>
<Name>ABCD</Name>
<Value>111</Value>
</ParentField>
<ParentField>
<Name>CDBA</Name>
<Value>222</Value>
</ParentField>
</ParentArray>
</Parent>
<Parent>
<ParentID>12534</ParentID>
<ParentArray>
<ParentField>
<Name>ABCD</Name>
<Value>111</Value>
</ParentField>
<ParentField>
<Name>ABCD</Name>
<Value>222</Value>
</ParentField>
<ParentField>
<Name>CDBA</Name>
<Value>333</Value>
</ParentField>
</ParentArray>
</Parent>
<Parent>
<ParentID>51342</ParentID>
<ParentArray>
<ParentField>
<Name>ABCD</Name>
<Value>111</Value>
</ParentField>
<ParentField>
<Name>ABCD</Name>
<Value>222</Value>
</ParentField>
<ParentField>
<Name>ABCD</Name>
<Value>333</Value>
</ParentField>
<ParentField>
<Name>CDBA</Name>
<Value>444</Value>
</ParentField>
</ParentArray>
</Parent>
</Parents>
预期输出:
ParentID Name Value
12345 ABCD 111
54321 ABCD 111
54321 CDBA 222
12534 ABCD 111
12534 ABCD 222
12534 CDBA 333
51342 ABCD 111
51342 ABCD 222
51342 ABCD 333
51342 CDBA 444
在每个父母家庭中,都有一个ParentID标签.同样,在ParentArray子家族中,存在多个{Name,Value}对的ParentField子家族.需要在每个父级家庭中使用其{名称,值}对正确映射ParentID.
In every Parent family, there's a ParentID tag. Also in ParentArray subfamily, there are multiple ParentField subfamilies with {Name, Value} pairs present. Need to get ParentID's mapped correctly with their {Name, Value} pairs, within each Parent family.
推荐答案
按数组中的值和位置组合XPATH过滤.查看代码中的注释:
Combine XPATH filtering by values and positions in arrays. See comments in the code:
with your_data as (
select '<Parents>
<Parent>
<ParentID>12345</ParentID>
<ParentArray>
<ParentField>
<Name>ABCD</Name>
<Value>111</Value>
</ParentField>
</ParentArray>
</Parent>
<Parent>
<ParentID>54321</ParentID>
<ParentArray>
<ParentField>
<Name>ABCD</Name>
<Value>111</Value>
</ParentField>
<ParentField>
<Name>CDBA</Name>
<Value>222</Value>
</ParentField>
</ParentArray>
</Parent>
<Parent>
<ParentID>12534</ParentID>
<ParentArray>
<ParentField>
<Name>ABCD</Name>
<Value>111</Value>
</ParentField>
<ParentField>
<Name>ABCD</Name>
<Value>222</Value>
</ParentField>
<ParentField>
<Name>CDBA</Name>
<Value>333</Value>
</ParentField>
</ParentArray>
</Parent>
<Parent>
<ParentID>51342</ParentID>
<ParentArray>
<ParentField>
<Name>ABCD</Name>
<Value>111</Value>
</ParentField>
<ParentField>
<Name>ABCD</Name>
<Value>222</Value>
</ParentField>
<ParentField>
<Name>ABCD</Name>
<Value>333</Value>
</ParentField>
<ParentField>
<Name>CDBA</Name>
<Value>444</Value>
</ParentField>
</ParentArray>
</Parent>
</Parents>
' as xmlinfo
)
select p.parentid, n.name, -- n.pos+1,
--filter by parentid, name and position and extract scalar
XPATH_STRING(xmlinfo,concat('(((Parents/Parent)[ParentID="',p.parentid,'"])/ParentArray/ParentField[',n.pos+1,'])[Name="',n.name,'"]/Value/text()')) as value
from your_data d
lateral view explode(XPATH(xmlinfo, 'Parents/Parent/ParentID/text()')) p as parentid
--filer by parentID to get array of Name with position inside ParentArray
lateral view posexplode(XPATH(xmlinfo, concat('(Parents/Parent)[ParentID="',p.parentid,'"]/ParentArray/ParentField/Name/text()'))) n as pos, name
;
结果:
p.parentid n.name value
12345 ABCD 111
54321 ABCD 111
54321 CDBA 222
12534 ABCD 111
12534 ABCD 222
12534 CDBA 333
51342 ABCD 111
51342 ABCD 222
51342 ABCD 333
51342 CDBA 444
这篇关于在Hive中,如何在子家族&中分解XML中的标签.外亚科,并适当地对它们进行家庭映射?的文章就介绍到这了,希望我们推荐的答案对大家有所帮助,也希望大家多多支持!