I have the following output from a Pig relation:
dump g:
()
(97)
(245)
(870)
(480)
describe g:
g: {long}
I want to add up all of the numbers above into a single total, so I tried the following:
h = foreach g generate SUM($0);
I get this error:
Please use an explicit cast.
I then tried casting the value to (int), but it still did not work.
The output I am looking for is:
1692
Here is the code that leads up to this point:
a = LOAD 'tellers' using TextLoader() AS line;
-- convert a to chararray
b = foreach a generate (chararray)line;
-- run through my UDF to create tuples
c = foreach b generate myudfs.TellerParser5(line); -- ({(20),(5),(5),(10)(1),(1),(1),(1),(1),(5),(10),(10),(10)})....
d = foreach c generate flatten(number);
e = group d by number; -- {group: chararray,d: {(number: chararray)}}
f = foreach e generate group, COUNT(d); -- f: {group: chararray,long}
g = foreach f generate (long)$0 * $1;
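Best answer
SUM is an aggregate function that expects a bag, not a single field of one row, so you need to group the whole relation into one bag first: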
H = GROUP g ALL;
I = FOREACH H GENERATE SUM(g.$0);
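
To see why the grouping step is needed, here is a rough Python model of what GROUP ... ALL followed by SUM does. It is only a sketch of the semantics, not Pig code: the helper names `group_all` and `pig_sum` are made up for illustration, and the empty tuple () from the dump is assumed to be a null field, which Pig's SUM ignores.

```python
# Hypothetical model of Pig's GROUP ... ALL followed by SUM; this is not Pig itself.

def group_all(relation):
    """Mimic GROUP g ALL: a single output row that holds every input tuple in one bag."""
    return [("all", list(relation))]

def pig_sum(bag, field=0):
    """Mimic SUM(g.$0): add one field across a bag, skipping nulls (None)."""
    values = [row[field] for row in bag if row and row[field] is not None]
    # Assumption: a bag with no non-null values yields null (None), not 0.
    return sum(values) if values else None

# The dumped rows of g; the empty tuple () is modelled here as a null field.
g = [(None,), (97,), (245,), (870,), (480,)]

h = group_all(g)                             # one row: ("all", {all tuples of g})
i = [(pig_sum(bag),) for _group, bag in h]   # one row holding the grand total
print(i)  # [(1692,)]
```

Without the grouping step, the FOREACH visits one row at a time, so SUM receives a single long instead of a bag, which is what the cast error is complaining about.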
This question about summing values in Pig tuples (hadoop) comes from a similar question on Stack Overflow: https://stackoverflow.com/questions/33219418/