I have the following tuples as output from Pig:

 dump g:
 ()
 (97)
 (245)
 (870)
 (480)

 describe g:
 g: {long}

I want to add up all of the numbers above into a single total, so I tried the following:
 h = foreach g generate SUM($0);

I get this error:
 Please use an explicit cast.

I then tried casting the value to (int), but it still does not work.

The output I am looking for is:
 1692

Here is the code that leads up to g:
 a = LOAD 'tellers' using TextLoader() AS line;
 -- convert a to chararray
 b = foreach a generate (chararray)line;
 -- run through my UDF to create tuples
 c = foreach b generate myudfs.TellerParser5(line);  -- ({(20),(5),(5),(10),(1),(1),(1),(1),(1),(5),(10),(10),(10)})....
 d = foreach c generate flatten(number);
 e = group d by number; -- {group: chararray,d: {(number: chararray)}}
 f = foreach e generate group, COUNT(d);  -- f: {group: chararray,long}
 g = foreach f generate (long)$0 * $1;

Best answer

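SUM is an aggregate function that operates on a bag, so you need to group the relation first and then sum over the grouped bag: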

 h = GROUP g ALL;
 i = FOREACH h GENERATE SUM(g.$0);
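
To make the reasoning concrete, here is a small Python model of what the grouping and summing steps do. It is a sketch, not Pig: `group_all` and `pig_sum` are hypothetical helper names invented for illustration, and the empty tuple `()` in the dump is assumed to be a null value, which Pig's SUM skips.

```python
# Minimal Python model of the Pig steps above; this is NOT Pig code.
# A relation is modeled as a list of one-field tuples; None stands for Pig's null.

def group_all(relation):
    """Model of GROUP ... ALL: a single output row holding every input tuple in one bag."""
    return [("all", list(relation))]

def pig_sum(bag, field=0):
    """Model of SUM over one field of a bag: nulls are skipped; no non-null values gives None."""
    values = [t[field] for t in bag if t[field] is not None]
    return sum(values) if values else None

# The dump of g from the question; the empty tuple () is assumed to be a null field.
g = [(None,), (97,), (245,), (870,), (480,)]

h = group_all(g)                             # one row: ("all", <bag of all tuples in g>)
i = [(pig_sum(bag),) for _group, bag in h]   # sum the first field across that bag
print(i)  # [(1692,)]
```

Without the grouping step there is no bag to aggregate over: in `foreach g generate SUM($0)`, the field `$0` is a single long, which is why Pig cannot match it to SUM and asks for an explicit cast.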

Regarding "hadoop - summing values in Pig tuples", a similar question can be found on Stack Overflow: https://stackoverflow.com/questions/33219418/
