我正在尝试做一个简单的Pig查询,我需要在其中查找ID为178的电影的平均收视率。我尝试了以下几种版本,并且过滤器正在运行,但AVG功能却不起作用。有人可以建议吗?谢谢

a = load '/user/pig/u.data' AS (userid:int, movieid:int, rating:double, timestamp:chararray);
b = FOREACH a GENERATE AVG(rating) as rate, movieid;
c = group b by rate;
d= filter a by movieid==178;
dump d;

最佳答案

您应该按movieid分组

b = FILTER a BY (movieid == 178);
c = GROUP b BY movied;
d = FOREACH c GENERATE group AS movieid,AVG(a.rating) as rate;

关于hadoop - Apache Pig AVG函数,我们在Stack Overflow上找到一个类似的问题:https://stackoverflow.com/questions/44333500/

10-11 16:35