Input path does not exist when running a WordCount MapReduce job

Question

I have installed a single-node Hadoop cluster on Ubuntu and am trying to run a WordCount program. I have built the jar file, but when I run this command:

hadoop jar '/home/hduser/Desktop/TutorialFolder/firstTutorial.jar' WordCount /home/hduser/Desktop/TutorialFolder/input_data /TutorialFolder/Output

it fails with the following error:

Exception in thread "main" org.apache.hadoop.mapreduce.lib.input.InvalidInputException: Input path does not exist: hdfs://localhost:9000/home/hduser/Desktop/TutorialFolder/input_data
    at org.apache.hadoop.mapreduce.lib.input.FileInputFormat.singleThreadedListStatus(FileInputFormat.java:323)


Recommended answer

The input and output paths passed to a MapReduce job must be valid HDFS paths, unless you are running Hadoop in local mode.

The input path /home/hduser/Desktop/TutorialFolder/input_data passed here looks like a local directory. Because it carries no scheme, Hadoop resolves it against the default filesystem, which is why the error reports hdfs://localhost:9000/home/hduser/Desktop/TutorialFolder/input_data. Create a suitable directory in HDFS and upload the input data to it:
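To see why a local-looking path ends up in an hdfs:// error message, here is a minimal sketch of the resolution rule. It is an illustration only, not Hadoop's actual code: qualify_path is a hypothetical helper, and the default filesystem value hdfs://localhost:9000 is taken from the error message in the question (in a real installation it comes from the fs.defaultFS property).

```shell
#!/bin/sh
# Hypothetical helper (not a Hadoop command): mimics how a path is
# qualified against the default filesystem.
#   $1 = default filesystem URI, e.g. the value of fs.defaultFS
#   $2 = path as typed on the command line
qualify_path() {
    default_fs=$1
    path=$2
    case "$path" in
        [A-Za-z]*:/*)
            # Already carries a scheme (hdfs://..., file:///...): left as-is.
            printf '%s\n' "$path"
            ;;
        /*)
            # Absolute path without a scheme: the default filesystem is prepended.
            printf '%s%s\n' "${default_fs%/}" "$path"
            ;;
        *)
            # Relative paths are out of scope for this sketch.
            echo "relative path not handled in this sketch: $path" >&2
            return 1
            ;;
    esac
}

# The path from the question is looked up in HDFS, not on the local disk.
# prints hdfs://localhost:9000/home/hduser/Desktop/TutorialFolder/input_data
qualify_path hdfs://localhost:9000 /home/hduser/Desktop/TutorialFolder/input_data

# An explicit scheme is kept unchanged.
# prints file:///home/hduser/Desktop/TutorialFolder/input_data
qualify_path hdfs://localhost:9000 file:///home/hduser/Desktop/TutorialFolder/input_data
```

On a real cluster the configured default filesystem can be printed with hdfs getconf -confKey fs.defaultFS.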

hdfs dfs -mkdir -p /wordcount/input_data
hdfs dfs -put /home/hduser/Desktop/TutorialFolder/input_data/* /wordcount/input_data/

The output path must also be in HDFS. Then run the jar with the HDFS paths:

hadoop jar /home/hduser/Desktop/TutorialFolder/firstTutorial.jar WordCount /wordcount/input_data /wordcount/output
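Before submitting the job, it can help to confirm that both paths are in the expected state. The sketch below is one possible pre-flight check, not part of the original answer: preflight is a hypothetical helper name, and it relies only on the standard hdfs dfs -test command. It also checks that the output path does not exist yet, since a MapReduce job normally refuses to write into an existing output directory.

```shell
#!/bin/sh
# Hypothetical helper: verifies the HDFS paths before the job is submitted.
#   $1 = input directory in HDFS
#   $2 = output directory in HDFS (must not exist yet)
preflight() {
    input=$1
    output=$2

    # "hdfs dfs -test -d" exits 0 only if the path exists and is a directory.
    if ! hdfs dfs -test -d "$input"; then
        echo "input directory not found in HDFS: $input" >&2
        return 1
    fi

    # "hdfs dfs -test -e" exits 0 if the path exists at all.
    if hdfs dfs -test -e "$output"; then
        echo "output path already exists in HDFS: $output" >&2
        return 2
    fi

    return 0
}

# Example use (commented out so that sourcing this file runs nothing):
# preflight /wordcount/input_data /wordcount/output &&
#     hadoop jar /home/hduser/Desktop/TutorialFolder/firstTutorial.jar \
#         WordCount /wordcount/input_data /wordcount/output
```

With the paths from the question, the first check would fail, because /home/hduser/Desktop/TutorialFolder/input_data exists only on the local disk and not in HDFS.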

