Problem description
I am learning Hadoop Pig and I always get stuck at referencing elements. Please see the example below.
groupwordcount: {group: chararray,words: {(bag_of_tokenTuples_from_line::token: chararray)}}
Can somebody please explain how to reference the elements when we have nested tuples and bags?
Any links for better understanding nested referencing would be a great help.
Let's do a simple demonstration to understand this problem.
Say a file 'a.txt' is stored at '/tmp/a.txt' in HDFS:
A = LOAD '/tmp/a.txt' using PigStorage(',') AS (name:chararray,term:chararray,gpa:float);
Dump A;
(John,fl,3.9)
(John,fl,3.7)
(John,sp,4.0)
(John,sm,3.8)
(Mary,fl,3.8)
(Mary,fl,3.9)
(Mary,sp,4.0)
(Mary,sm,4.0)
Now let's group this alias 'A' on the basis of some fields, say name and term:
B = GROUP A BY (name,term);
dump B;
((John,fl),{(John,fl,3.7),(John,fl,3.9)})
((John,sm),{(John,sm,3.8)})
((John,sp),{(John,sp,4.0)})
((Mary,fl),{(Mary,fl,3.9),(Mary,fl,3.8)})
((Mary,sm),{(Mary,sm,4.0)})
((Mary,sp),{(Mary,sp,4.0)})
describe B;
B: {group: (name: chararray,term: chararray),A: {(name: chararray,term: chararray,gpa: float)}}
Now B has the same kind of nested schema as in your question. Let me demonstrate how to access elements of the group tuple, elements of the tuples in bag A, or both:
C = foreach B generate group.name,group.term,A.name,A.term,A.gpa;
dump C;
(John,fl,{(John),(John)},{(fl),(fl)},{(3.7),(3.9)})
(John,sm,{(John)},{(sm)},{(3.8)})
(John,sp,{(John)},{(sp)},{(4.0)})
(Mary,fl,{(Mary),(Mary)},{(fl),(fl)},{(3.9),(3.8)})
(Mary,sm,{(Mary)},{(sm)},{(4.0)})
(Mary,sp,{(Mary)},{(sp)},{(4.0)})
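Each A.field projection above is itself a bag, which is why the output shows values wrapped in {...}. As a rough sketch of what is usually done next (the aliases D, E, F and the field names n, avg_gpa are made up for illustration, and B is assumed to be the grouped relation from above), such a bag can be passed to an aggregate function, un-nested with FLATTEN, or referenced by position:

```pig
-- Assumes B = GROUP A BY (name, term); as defined above.

-- Aggregate over the bag: one row per (name, term) group.
D = FOREACH B GENERATE group.name, group.term, COUNT(A) AS n, AVG(A.gpa) AS avg_gpa;

-- FLATTEN un-nests the bag: one row per original tuple of A.
E = FOREACH B GENERATE group.name, group.term, FLATTEN(A.gpa) AS gpa;

-- Positional references also work: $0 is the first field, $2 the third.
F = FOREACH B GENERATE group.$0, A.$2;
```

With FLATTEN, a group such as (John,fl) should produce one plain row per gpa value instead of a single row holding a bag.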
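So this is how we can access all the elements: group.field for the fields of the group tuple, and A.field for the bag of that field's values.
Hope this helps.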
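Applied to the schema in the question, the same idea might look like the sketch below. It assumes a relation named groupwordcount with exactly the schema shown by the asker; the aliases counts, tokens, tokens2 and the field names n and token are hypothetical. Note that group here is a plain chararray rather than a tuple, so it is referenced directly.

```pig
-- Assumed schema (from the question):
-- groupwordcount: {group: chararray, words: {(bag_of_tokenTuples_from_line::token: chararray)}}

-- Count the tuples in the words bag for each group.
counts = FOREACH groupwordcount GENERATE group, COUNT(words) AS n;

-- Project the bag's only field by position and un-nest it.
tokens = FOREACH groupwordcount GENERATE group, FLATTEN(words.$0) AS token;

-- Or project it by the fully qualified name that DESCRIBE prints (with the :: prefix).
tokens2 = FOREACH groupwordcount GENERATE group, FLATTEN(words.bag_of_tokenTuples_from_line::token) AS token;
```

The positional form is the least error-prone when the field name carries a long :: prefix.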