Problem description
I have installed a single-node Hadoop cluster on Ubuntu and am trying to run the WordCount program. I have created the jar file, but when I run this command:
hadoop jar '/home/hduser/Desktop/TutorialFolder/firstTutorial.jar' WordCount /home/hduser/Desktop/TutorialFolder/input_data /TutorialFolder/Output
it gives the following error:
Exception in thread "main" org.apache.hadoop.mapreduce.lib.input.InvalidInputException: Input path does not exist: hdfs://localhost:9000/home/hduser/Desktop/TutorialFolder/input_data
    at org.apache.hadoop.mapreduce.lib.input.FileInputFormat.singleThreadedListStatus(FileInputFormat.java:323)
Recommended answer
The input and output paths given to a MapReduce job must be valid HDFS paths unless you are running Hadoop in local mode.
The input path /home/hduser/Desktop/TutorialFolder/input_data passed here looks like a local directory. Create a similar structure in HDFS and upload the input data to that directory:
hdfs dfs -mkdir -p /wordcount/input_data
hdfs dfs -put /home/hduser/Desktop/TutorialFolder/input_data/* /wordcount/input_data/
The output path must also be in HDFS. Run the jar with the HDFS paths:
hadoop jar /home/hduser/Desktop/TutorialFolder/firstTutorial.jar WordCount /wordcount/input_data /wordcount/output
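To see why the error message shows hdfs://localhost:9000 in front of the local path, it may help to look at how a path without a scheme is resolved against a default file system URI. The sketch below uses only java.net.URI from the JDK; it is an illustration of the resolution rule, not Hadoop's own code. The class name PathResolution and the helper qualify are hypothetical, and the default file system value is taken from the prefix in the error above.

```java
import java.net.URI;

public class PathResolution {
    // Hypothetical helper: resolve a command-line path argument against a
    // default file system URI, using plain java.net.URI resolution rules.
    static String qualify(String defaultFs, String pathArg) {
        return URI.create(defaultFs).resolve(pathArg).toString();
    }

    public static void main(String[] args) {
        // Assumed default file system, taken from the prefix in the error message.
        String defaultFs = "hdfs://localhost:9000/";

        // A scheme-less local-looking path ends up pointing into HDFS,
        // which reproduces the path reported in the exception.
        System.out.println(qualify(defaultFs, "/home/hduser/Desktop/TutorialFolder/input_data"));

        // The path created with hdfs dfs -mkdir resolves the same way,
        // but this one actually exists in HDFS.
        System.out.println(qualify(defaultFs, "/wordcount/input_data"));
    }
}
```

The first line printed matches the path in the InvalidInputException, which suggests the job was looking in HDFS for a directory that only exists on the local disk.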