Unable to set mapreduce.job.reduces via GenericOptionsParser


This article describes how to handle the problem of being unable to set mapreduce.job.reduces through GenericOptionsParser; it may be a useful reference for anyone hitting the same issue.

Problem description



hadoop jar MapReduceTryouts-1.jar invertedindex.simple.MyDriver -D mapreduce.job.reduces=10 /user/notprabhu2/Input/potter/ /user/notprabhu2/output

I have been trying in vain to set the number of reducers through the -D option provided by GenericOptionsParser, but it does not seem to work and I have no idea why.

I tried -D mapreduce.job.reduces=10 (with a space after -D) and also

-Dmapreduce.job.reduces=10 (without a space after -D), but nothing seems to budge.

In my Driver class I have implemented the Tool interface.

package invertedindex.simple;

import org.apache.hadoop.conf.Configuration;
import org.apache.hadoop.conf.Configured;
import org.apache.hadoop.fs.Path;
import org.apache.hadoop.io.Text;
import org.apache.hadoop.mapreduce.Job;
import org.apache.hadoop.mapreduce.lib.input.TextInputFormat;
import org.apache.hadoop.mapreduce.lib.output.TextOutputFormat;
import org.apache.hadoop.util.Tool;
import org.apache.hadoop.util.ToolRunner;

public class MyDriver extends Configured implements Tool {

    @Override
    public int run(String[] args) throws Exception {

        Configuration conf = getConf();
        Job job = Job.getInstance(conf);

        job.setJarByClass(MyDriver.class);

        Path outputPath =  new Path(args[1]);
        outputPath.getFileSystem(getConf()).delete(outputPath, true);

        job.setMapperClass(MyMapper.class);
        job.setReducerClass(MyReducer.class);

        job.setInputFormatClass(TextInputFormat.class);
        job.setOutputFormatClass(TextOutputFormat.class);

        TextInputFormat.addInputPath(job, new Path(args[0]));
        TextOutputFormat.setOutputPath(job, outputPath);

        job.setNumReduceTasks(3);

        job.setOutputKeyClass(Text.class);
        job.setOutputValueClass(Text.class);
        return job.waitForCompletion(true) ? 0 : 1;

    }

    public static void main(String[] args) throws Exception {
        int exitCode = ToolRunner.run(new Configuration(), new MyDriver(), args);
        System.exit(exitCode);
    }

}

Since I have explicitly set the number of reducers to 3 in my driver code, I always end up with 3 reducers.

I am using CDH 5.4.7, which has Hadoop 2.6.0, on a 2-node cluster on Google Compute Engine.
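For context, GenericOptionsParser consumes -D key=value pairs from the front of the argument list and loads them into the job Configuration before the tool ever sees its own positional arguments. The following is a simplified, self-contained sketch of that behaviour (the class name DOptionDemo is made up for illustration, and this is not the real Hadoop parser, which is built on commons-cli):

```java
import java.util.HashMap;
import java.util.Map;

public class DOptionDemo {

    // Simplified sketch: collect "-D key=value" (spaced) and "-Dkey=value"
    // (fused) pairs into a configuration map, the way GenericOptionsParser
    // populates the Hadoop Configuration before the driver runs.
    static Map<String, String> parseDOptions(String[] args) {
        Map<String, String> conf = new HashMap<>();
        for (int i = 0; i < args.length; i++) {
            String a = args[i];
            String pair = null;
            if (a.equals("-D") && i + 1 < args.length) {
                pair = args[++i];          // "-D key=value" form
            } else if (a.startsWith("-D") && a.contains("=")) {
                pair = a.substring(2);     // "-Dkey=value" form
            }
            if (pair != null) {
                String[] kv = pair.split("=", 2);
                conf.put(kv[0].trim(), kv[1].trim());
            }
        }
        return conf;
    }

    public static void main(String[] args) {
        Map<String, String> conf = parseDOptions(new String[] {
            "-D", "mapreduce.job.reduces=10",
            "/user/notprabhu2/Input/potter/", "/user/notprabhu2/output"});
        System.out.println(conf.get("mapreduce.job.reduces")); // 10
        // A later explicit setter in the driver (like setNumReduceTasks(3))
        // simply overwrites the value, which is why the -D flag can appear
        // to be silently ignored.
        conf.put("mapreduce.job.reduces", "3");
        System.out.println(conf.get("mapreduce.job.reduces")); // 3
    }
}
```

The key point the sketch illustrates is ordering: the parser fills the configuration first, so anything the driver sets afterwards wins.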

Solution

Figured it out. Turned out to be so silly, but I am still posting the answer in case someone else makes the same silly mistake.

It seems the job.setNumReduceTasks(3); line in my driver class takes precedence over -D mapreduce.job.reduces=10 on the command line.

When I removed the job.setNumReduceTasks(3); line from my code, everything worked fine.
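If a fallback reducer count is still wanted when the user supplies no -D flag, one option is to apply the hard-coded value only when the property is absent from the configuration. A minimal sketch of that guard (using a plain map as a stand-in for Hadoop's Configuration so it runs without Hadoop on the classpath; in a real driver the equivalent would be checking conf.get("mapreduce.job.reduces") before calling job.setNumReduceTasks):

```java
import java.util.HashMap;
import java.util.Map;

public class ReducerDefault {

    // Choose the reducer count: a -D mapreduce.job.reduces=N flag on the
    // command line wins; the hard-coded fallback applies only when the
    // property was never supplied.
    static int chooseReduces(Map<String, String> conf, int fallback) {
        String fromCli = conf.get("mapreduce.job.reduces");
        return fromCli == null ? fallback : Integer.parseInt(fromCli);
    }

    public static void main(String[] args) {
        Map<String, String> withFlag = new HashMap<>();
        withFlag.put("mapreduce.job.reduces", "10");
        System.out.println(chooseReduces(withFlag, 3));        // 10
        System.out.println(chooseReduces(new HashMap<>(), 3)); // 3
    }
}
```

This keeps a sensible default for ad-hoc runs while letting the GenericOptionsParser-supplied value take effect, instead of the unconditional setNumReduceTasks(3) call that masked it.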

That concludes this article on being unable to set mapreduce.job.reduces through GenericOptionsParser; hopefully the answer above is helpful.
