Sample data

DATE      WindDirection

1/1/2000  SW
1/2/2000  SW
1/3/2000  SW
1/4/2000  NW
1/5/2000  NW

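The problem

The days are irregular and the wind direction is not unique per day, so we are trying to get the COUNT for the most common wind direction.

My query is: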
weather_data = FOREACH Weather GENERATE $16 AS Date, $9 AS w_direction;
e = FOREACH weather_data
            {
                unique_winds = DISTINCT weather_data.w_direction;
                GENERATE unique_winds, COUNT(unique_winds);
            }
dump e;

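The logic is to find the DISTINCT wind directions (there are about 7), then group by wind direction and apply a count.

Right now, I want to get the total count of winds, or the count of the wind directions.

Best answer

You will have to group by wind direction and get the count for each group. Then sort the counts in descending order and take the top row.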

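-- Keep only the wind direction column (field 9 of Weather).
wd = FOREACH Weather GENERATE $9 AS w_direction;
-- One group per distinct wind direction.
gwd = GROUP wd BY w_direction;
-- Count the rows in each group.
cwd = FOREACH gwd GENERATE group AS w_direction, COUNT(wd.$0) AS cnt;
-- Sort by count, highest first, and keep the top row.
owd = ORDER cwd BY $1 DESC;
mwd = LIMIT owd 1;
DUMP mwd;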
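
One thing to keep in mind with LIMIT 1 is that it returns a single row even when two directions share the highest count. The sketch below is one possible variant, not part of the original answer: it keeps every direction tied for the maximum and also reports how many distinct directions exist. The file name weather_sample.tsv, its two-column tab-separated layout (the five sample rows, no header line), and the aliases clean, all_cwd, stats and top are assumptions made for illustration only.

```pig
-- Hypothetical input: the five sample rows, tab-separated, no header line.
Weather = LOAD 'weather_sample.tsv' USING PigStorage('\t')
          AS (date:chararray, w_direction:chararray);

-- Keep the direction column and drop rows where it is missing.
wd    = FOREACH Weather GENERATE w_direction;
clean = FILTER wd BY w_direction IS NOT NULL;

-- Count rows per distinct wind direction.
gwd = GROUP clean BY w_direction;
cwd = FOREACH gwd GENERATE group AS w_direction, COUNT(clean) AS cnt;

-- Collapse all per-direction counts into a single row:
-- the number of distinct directions and the highest count.
all_cwd = GROUP cwd ALL;
stats   = FOREACH all_cwd GENERATE COUNT(cwd) AS n_directions,
                                   MAX(cwd.cnt) AS max_cnt;

-- Keep every direction whose count equals the maximum (ties included).
-- stats has exactly one row, so stats.max_cnt can be used as a scalar.
top = FILTER cwd BY cnt == stats.max_cnt;

DUMP stats;
DUMP top;
```

On the sample data above, SW appears three times and NW twice, so stats should hold 2 distinct directions with a maximum count of 3, and top should contain only the SW row. The extra GROUP ... ALL pass costs one more job than ORDER plus LIMIT, which seems a fair trade when ties matter.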

Regarding hadoop - Pig - Get Max Count, we found a similar question on Stack Overflow: https://stackoverflow.com/questions/36753093/
