Passing parameters to Hadoop mappers
Problem description
I'm using the new Hadoop API and looking for a way to pass some parameters (a few strings) to the mappers.
How can I do that?
This solution works for the old API:
JobConf job = (JobConf)getConf();
job.set("NumberOfDocuments", args[0]);
Here, "NumberOfDocuments" is the name of the parameter, and its value is read from "args[0]", a command-line argument. Once you have set this parameter, you can retrieve its value in the reducer or mapper as follows:
private static Long N;
public void configure(JobConf job) {
N = Long.parseLong(job.get("NumberOfDocuments"));
}
Note that the tricky part is that you cannot set parameters like this:
Configuration con = new Configuration();
con.set("NumberOfDocuments", args[0]);
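The post does not say why this fails. One plausible explanation, offered here as an assumption, is that a job works from its own copy of the configuration, so values placed on an unrelated Configuration object (or added after the job was created) never reach the tasks. The sketch below models that behaviour with plain Java only: java.util.Properties stands in for Hadoop's Configuration, and SnapshotJob and SnapshotDemo are hypothetical names, not Hadoop classes.

```java
import java.util.Properties;

// Hypothetical stand-in for a job: it copies the settings it is given at
// construction time, modelling a job that carries its own configuration.
class SnapshotJob {
    private final Properties settings = new Properties();

    SnapshotJob(Properties source) {
        // Copy every entry; later changes to `source` are not seen here.
        for (String name : source.stringPropertyNames()) {
            settings.setProperty(name, source.getProperty(name));
        }
    }

    // Returns the value the job was created with, or null if it was never set.
    String get(String name) {
        return settings.getProperty(name);
    }
}

class SnapshotDemo {
    public static void main(String[] args) {
        Properties conf = new Properties();
        conf.setProperty("NumberOfDocuments", "42"); // set before creating the job
        SnapshotJob job = new SnapshotJob(conf);

        // A separate object the job never saw: values set here do not reach it.
        Properties other = new Properties();
        other.setProperty("Late", "7");

        System.out.println(job.get("NumberOfDocuments")); // value set before creation
        System.out.println(job.get("Late"));              // never reached the job
    }
}
```

Under this model, the safe order is to set every parameter first and only then hand the configuration to the job.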
Recommended answer
Set the parameter in the driver as shown below, before running the job:
Configuration conf = new Configuration();
conf.set("test", "123");
Job job = new Job(conf);
In the mapper/reducer, get the parameter as follows:
Configuration conf = context.getConfiguration();
String param = conf.get("test");
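Since the parameters are passed as strings, a missing key comes back as null, and parsing it directly (as with Long.parseLong in the old-API example) would throw. The following is a minimal sketch of defensive retrieval in plain Java; a Map stands in for the Hadoop Configuration, and ParamReader is a hypothetical helper, not part of Hadoop. Hadoop's Configuration also provides typed getters with defaults, such as getLong(name, defaultValue), which may be the simpler choice in a real job.

```java
import java.util.HashMap;
import java.util.Map;

// Hypothetical helper: reads a numeric parameter from string key/value settings.
class ParamReader {
    // Returns the parsed value, or defaultValue when the key is missing
    // or its value is not a valid long.
    static long getLong(Map<String, String> conf, String name, long defaultValue) {
        String raw = conf.get(name);
        if (raw == null) {
            return defaultValue;
        }
        try {
            return Long.parseLong(raw.trim());
        } catch (NumberFormatException e) {
            return defaultValue;
        }
    }

    public static void main(String[] args) {
        Map<String, String> conf = new HashMap<>();
        conf.put("test", "123"); // same key and value as in the answer above

        System.out.println(getLong(conf, "test", -1L));    // key present
        System.out.println(getLong(conf, "missing", -1L)); // key absent, default used
    }
}
```

Silently falling back to a default is a design choice; for a required parameter it may be better to fail fast with a clear error message instead.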