This article describes how to run a MapReduce job on a cluster with a mapper that reads its input from a local filesystem directory. It may be a useful reference for anyone facing the same problem.

Problem description

I gave an input to the mapper from a local filesystem. The job runs successfully from Eclipse, but not from the cluster, where it is unable to find the local input path and fails with: input path does not exist. Can anybody help me pass a local file path to a mapper so that it can run in the cluster and I can get the output in HDFS?

Solution


This is a very old question, but I recently faced the same issue. I am not sure how correct this solution is; it worked for me, though. Please point out any drawbacks. Here is what I did.

Reading a solution from the mail-archives, I realised that if I modify fs.default.name from hdfs://localhost:8020/ to file:///, the job can access the local file system. However, I didn't want this for all my MapReduce jobs, so I made a copy of core-site.xml in a local system folder (the same one from where I submit my MR jar to hadoop jar).
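For reference, the local copy of core-site.xml would override the default filesystem roughly as follows (a sketch; only the fs.default.name property comes from the answer above, the rest is standard Hadoop config boilerplate):

```xml
<?xml version="1.0"?>
<configuration>
  <!-- Point the default filesystem at the local disk instead of HDFS -->
  <property>
    <name>fs.default.name</name>
    <value>file:///</value>
  </property>
</configuration>
```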

And in my Driver class for MR, I added:

Configuration conf = new Configuration();
// Local copy of core-site.xml with fs.default.name set to file:///
conf.addResource(new Path("/my/local/system/path/to/core-site.xml"));
// Cluster hdfs-site.xml, so the job can still reach HDFS for output
conf.addResource(new Path("/usr/lib/hadoop-0.20-mapreduce/conf/hdfs-site.xml"));

The MR job then takes its input from the local file system and writes its output to HDFS.
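A minimal driver along these lines might look as follows. This is a sketch, not the poster's actual code: the job name, the input/output paths and namenode address are hypothetical, it uses the identity Mapper/Reducer for brevity, and it assumes the Hadoop 0.20-era mapreduce API jars are on the classpath.

```java
import org.apache.hadoop.conf.Configuration;
import org.apache.hadoop.fs.Path;
import org.apache.hadoop.mapreduce.Job;
import org.apache.hadoop.mapreduce.Mapper;
import org.apache.hadoop.mapreduce.Reducer;
import org.apache.hadoop.mapreduce.lib.input.FileInputFormat;
import org.apache.hadoop.mapreduce.lib.output.FileOutputFormat;

public class LocalInputDriver {
    public static void main(String[] args) throws Exception {
        Configuration conf = new Configuration();
        // Load the local core-site.xml copy (fs.default.name = file:///) first,
        // then the cluster hdfs-site.xml so HDFS is still reachable for output.
        conf.addResource(new Path("/my/local/system/path/to/core-site.xml"));
        conf.addResource(new Path("/usr/lib/hadoop-0.20-mapreduce/conf/hdfs-site.xml"));

        Job job = new Job(conf, "local-input-job");
        job.setJarByClass(LocalInputDriver.class);
        // Identity mapper/reducer as placeholders for the real job logic
        job.setMapperClass(Mapper.class);
        job.setReducerClass(Reducer.class);

        // Input from the local filesystem; output written explicitly to HDFS
        FileInputFormat.addInputPath(job, new Path("file:///home/user/input"));
        FileOutputFormat.setOutputPath(job, new Path("hdfs://namenode:8020/user/output"));

        System.exit(job.waitForCompletion(true) ? 0 : 1);
    }
}
```

Qualifying the input path with the file:// scheme and the output path with the hdfs:// scheme makes the intent explicit even when the default filesystem is overridden.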

This concludes the article on running a local filesystem directory as the input to a mapper in a cluster. We hope the answer above is helpful; thank you for your support!
