如何为每种事件在这里计算多少1和0?我在 pig 做所有这一切,第二个字段中只有1和0。
数据如下所示:
(pageLoad,1)
(pageLoad,0)
(pageLoad,1)
(appLaunch,1)
(appLaunch,0)
(otherEvent,1)
(otherEvent,0)
(event,1)
(event,1)
(event,0)
(somethingelse,0)
输出将是这样的
pageLoad 1:234 0:2359
appLaunch 1:54 0:111
event 1:345 0:0
要么
type 1 0
pageLoad 21 345
appLaunch 0 123
event 234 12
感谢大家。
最佳答案
输入:
pageLoad,1
pageLoad,0
pageLoad,1
appLaunch,1
appLaunch,0
otherEvent,1
otherEvent,0
event,1
event,1
event,0
somethingelse,0
pig 脚本:
A = LOAD 'input.csv' USING PigStorage(',') AS (event_type:chararray,status:int);
B = GROUP A BY event_type;
req = FOREACH B {
event_type_1 = FILTER A BY status==1;
event_type_0 = FILTER A BY status==0;
GENERATE group AS event_type, COUNT(event_type_1) AS event_type_1_count, COUNT(event_type_0) AS event_type_0_count;
};
DUMP req;
输出:
(event,2,1)
(pageLoad,2,1)
(appLaunch,1,1)
(otherEvent,1,1)
(somethingelse,0,1)