Problem description
I created some counters in my Mapper class:
(example written using the appengine-mapreduce Java library v.0.5)
@Override
public void map(Entity entity) {
    getContext().incrementCounter("analyzed");
    if (isSpecial(entity)) {
        getContext().incrementCounter("special");
    }
}
(The method isSpecial just returns true or false depending on the state of the entity; it is not relevant to the question.)
I want to access those counters once all processing is done, in the finish method of the Output class:
@Override
public Summary finish(Collection<? extends OutputWriter<Entity>> writers) {
    // get the counters and save/return the summary
    int analyzed = 0; // getCounter("analyzed");
    int special = 0; // getCounter("special");
    Summary summary = new Summary(analyzed, special);
    save(summary);
    return summary;
}
... but the getCounter method is only available on the MapperContext class, which is only accessible through the getContext() method of Mappers/Reducers.
How can I access my counters at the Output stage?
Side note: I can't send the counter values through my output class, because the whole Map/Reduce is about transforming one set of Entities into another (in other words, the counters are not the main purpose of the Map/Reduce). The counters are just for control, so it makes sense to compute them here instead of creating another process just to do the counting.
Thanks.
Solution
There is no way to do this inside the output today, but feel free to request it here: https://code.google.com/p/appengine-mapreduce/issues/list
What you can do, however, is chain a job to run after your map-reduce; that job will receive the map-reduce's output and counters. There is an example of this here: https://code.google.com/p/appengine-mapreduce/source/browse/trunk/java/example/src/com/google/appengine/demos/mapreduce/entitycount/ChainedMapReduceJob.java
The example above runs 3 MapReduce jobs in a row. Note that these don't have to be MapReduce jobs: you can create your own class that extends Job and has a run method which creates your Summary object.
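To illustrate the shape of that chaining idea without depending on the library, here is a minimal, self-contained sketch in plain Java. Everything in it is a hypothetical stand-in rather than the appengine-mapreduce API: StageResult models "the output plus the counters" that a finished stage hands to the next job, runMapStage imitates the Mapper from the question, and summarize plays the role of the follow-up job's run method. The isSpecial rule (names starting with "s") is invented purely so the example has something to count.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

public class ChainedSummarySketch {

    /** Hypothetical stand-in for what a finished stage passes on: its output and its counters. */
    static final class StageResult<O> {
        final O output;
        final Map<String, Long> counters;

        StageResult(O output, Map<String, Long> counters) {
            this.output = output;
            this.counters = counters;
        }
    }

    /** The control summary from the question, reduced to the two counts. */
    static final class Summary {
        final long analyzed;
        final long special;

        Summary(long analyzed, long special) {
            this.analyzed = analyzed;
            this.special = special;
        }
    }

    /** Invented rule for illustration only; the real check depends on the entity's state. */
    static boolean isSpecial(String entity) {
        return entity.startsWith("s");
    }

    /** Imitates the map stage: transforms each entity and increments counters on the side. */
    static StageResult<List<String>> runMapStage(List<String> entities) {
        Map<String, Long> counters = new HashMap<>();
        List<String> transformed = new ArrayList<>();
        for (String entity : entities) {
            counters.merge("analyzed", 1L, Long::sum);
            if (isSpecial(entity)) {
                counters.merge("special", 1L, Long::sum);
            }
            transformed.add(entity.toUpperCase());
        }
        return new StageResult<>(transformed, counters);
    }

    /** Plays the role of the chained follow-up job: reads the counters and builds the Summary. */
    static Summary summarize(StageResult<?> result) {
        long analyzed = result.counters.getOrDefault("analyzed", 0L);
        long special = result.counters.getOrDefault("special", 0L);
        return new Summary(analyzed, special);
    }

    public static void main(String[] args) {
        StageResult<List<String>> result = runMapStage(Arrays.asList("simple", "plain", "star"));
        Summary summary = summarize(result);
        System.out.println(summary.analyzed + " analyzed, " + summary.special + " special");
    }
}
```

The point of the design is that the main output stays a pure entity-to-entity transformation, while the counters travel alongside it and are only read once the whole stage has finished. In the real library the follow-up step would be a class extending Job, as described above; the linked ChainedMapReduceJob example shows the actual wiring.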