Running MapReduce programs from Eclipse:
1 Compile and build the hadoop-eclipse plugin
OS: Ubuntu 12.04.4
Hadoop: 1.2.1
Eclipse: 4.3.0
Note that the following three files need to be edited; for reference, see:
http://www.kankanews.com/ICkengine/archives/63441.shtml
http://www.linuxidc.com/Linux/2013-10/91666p2.htm
(1) Edit ${hadoop-home}/src/contrib/eclipse-plugins/build.xml
Locate the relevant targets and add the required entries (the XML snippets are omitted here; the two reference links above show them in full).
(2) Edit ${hadoop-home}/src/contrib/build-contrib.xml
Specify the Hadoop version and the Eclipse install path by adding three lines (omitted here; see the reference links above).
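The exact lines were not preserved in this post; a commonly used set looks like the following sketch, where the version value and both paths are assumptions to be adapted to your own environment:

```xml
<!-- Assumed values: point eclipse.home at your own Eclipse install
     and hadoop.root at your Hadoop source tree -->
<property name="version" value="1.2.1"/>
<property name="eclipse.home" location="/home/huangxing/eclipse/eclipse"/>
<property name="hadoop.root" location="/home/huangxing/hadoop/hadoop-1.2.1"/>
```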
(3) Edit ${hadoop-home}/src/contrib/eclipse-plugins/META-INF/MANIFEST.MF
After Bundle-ClassPath: classes/,lib/hadoop-core.jar, append lib/jackson-core-asl-1.8.8.jar, lib/jackson-mapper-asl-1.8.8.jar, lib/commons-configuration-1.6.jar, lib/commons-lang-2.4.jar, lib/commons-httpclient-3.0.1.jar, and lib/commons-cli-1.2.jar.
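After the edit, the Bundle-ClassPath entry should read roughly as follows (shown here as one logical line; if you wrap it, note that MANIFEST.MF continuation lines must begin with a single space):

```
Bundle-ClassPath: classes/,lib/hadoop-core.jar,lib/jackson-core-asl-1.8.8.jar,lib/jackson-mapper-asl-1.8.8.jar,lib/commons-configuration-1.6.jar,lib/commons-lang-2.4.jar,lib/commons-httpclient-3.0.1.jar,lib/commons-cli-1.2.jar
```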
(4) Build
First install ant; installation is simple, it only matters that the ant command runs.
Change into ~/hadoop/hadoop-1.2.1/src/contrib/eclipse-plugin and run:
[~/hadoop/hadoop-1.2.1/src/contrib/eclipse-plugin]$ ant jar
Buildfile: /home/huangxing/hadoop/hadoop-1.2.1/src/contrib/eclipse-plugin/build.xml
check-contrib:
init:
[echo] contrib: eclipse-plugin
init-contrib:
ivy-download:
[get] Getting: http://repo2.maven.org/maven2/org/apache/ivy/ivy/2.1.0/ivy-2.1.0.jar
[get] To: /home/huangxing/hadoop/hadoop-1.2.1/ivy/ivy-2.1.0.jar
[get] Not modified - so not downloaded
ivy-probe-antlib:
ivy-init-antlib:
ivy-init:
[ivy:configure] :: Ivy 2.1.0 - 20090925235825 :: http://ant.apache.org/ivy/ ::
[ivy:configure] :: loading settings :: file = /home/huangxing/hadoop/hadoop-1.2.1/ivy/ivysettings.xml
ivy-resolve-common:
ivy-retrieve-common:
[ivy:cachepath] DEPRECATED: 'ivy.conf.file' is deprecated, use 'ivy.settings.file' instead
[ivy:cachepath] :: loading settings :: file = /home/huangxing/hadoop/hadoop-1.2.1/ivy/ivysettings.xml
compile:
[echo] contrib: eclipse-plugin
[javac] /home/huangxing/hadoop/hadoop-1.2.1/src/contrib/eclipse-plugin/build.xml:64: warning: 'includeantruntime' was not set, defaulting to build.sysclasspath=last; set to false for repeatable builds
jar:
[copy] Copying 1 file to /home/huangxing/hadoop/hadoop-1.2.1/build/contrib/eclipse-plugin/lib
[copy] Copying /home/huangxing/hadoop/hadoop-1.2.1/hadoop-core-1.2.1.jar to /home/huangxing/hadoop/hadoop-1.2.1/build/contrib/eclipse-plugin/lib/hadoop-core.jar
[copy] Copying 1 file to /home/huangxing/hadoop/hadoop-1.2.1/build/contrib/eclipse-plugin/lib
[copy] Copying /home/huangxing/hadoop/hadoop-1.2.1/lib/commons-cli-1.2.jar to /home/huangxing/hadoop/hadoop-1.2.1/build/contrib/eclipse-plugin/lib/commons-cli-1.2.jar
[copy] Copying 1 file to /home/huangxing/hadoop/hadoop-1.2.1/build/contrib/eclipse-plugin/lib
[copy] Copying /home/huangxing/hadoop/hadoop-1.2.1/lib/commons-lang-2.4.jar to /home/huangxing/hadoop/hadoop-1.2.1/build/contrib/eclipse-plugin/lib/commons-lang-2.4.jar
[copy] Copying 1 file to /home/huangxing/hadoop/hadoop-1.2.1/build/contrib/eclipse-plugin/lib
[copy] Copying /home/huangxing/hadoop/hadoop-1.2.1/lib/commons-configuration-1.6.jar to /home/huangxing/hadoop/hadoop-1.2.1/build/contrib/eclipse-plugin/lib/commons-configuration-1.6.jar
[copy] Copying 1 file to /home/huangxing/hadoop/hadoop-1.2.1/build/contrib/eclipse-plugin/lib
[copy] Copying /home/huangxing/hadoop/hadoop-1.2.1/lib/jackson-mapper-asl-1.8.8.jar to /home/huangxing/hadoop/hadoop-1.2.1/build/contrib/eclipse-plugin/lib/jackson-mapper-asl-1.8.8.jar
[copy] Copying 1 file to /home/huangxing/hadoop/hadoop-1.2.1/build/contrib/eclipse-plugin/lib
[copy] Copying /home/huangxing/hadoop/hadoop-1.2.1/lib/jackson-core-asl-1.8.8.jar to /home/huangxing/hadoop/hadoop-1.2.1/build/contrib/eclipse-plugin/lib/jackson-core-asl-1.8.8.jar
[copy] Copying 1 file to /home/huangxing/hadoop/hadoop-1.2.1/build/contrib/eclipse-plugin/lib
[copy] Copying /home/huangxing/hadoop/hadoop-1.2.1/lib/commons-httpclient-3.0.1.jar to /home/huangxing/hadoop/hadoop-1.2.1/build/contrib/eclipse-plugin/lib/commons-httpclient-3.0.1.jar
[jar] Building jar: /home/huangxing/hadoop/hadoop-1.2.1/build/contrib/eclipse-plugin/hadoop-eclipse-plugin-1.2.1.jar
BUILD SUCCESSFUL
Total time: 11 seconds
(5) Copy the generated plugin into Eclipse's plugins directory:
[~]$ cp /home/huangxing/hadoop/hadoop-1.2.1/build/contrib/eclipse-plugin/hadoop-eclipse-plugin-1.2.1.jar ~/eclipse/eclipse/plugins/
(6) Restart Eclipse
Unfortunately, no matter what I tried, Eclipse could not connect to HDFS and kept reporting a connection error.
Running netstat -tnlp suggested the daemons were only accepting IPv6 clients, so I disabled IPv6.
How to disable IPv6 on Ubuntu 12.10:
cat /proc/sys/net/ipv6/conf/all/disable_ipv6
If this prints 0, IPv6 is enabled; 1 means it is disabled.
If it is enabled, disable it as follows:
add the lines below to /etc/sysctl.conf, then reboot the system:
# disable IPv6
net.ipv6.conf.all.disable_ipv6 = 1
net.ipv6.conf.default.disable_ipv6 = 1
net.ipv6.conf.lo.disable_ipv6 = 1
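Instead of a full reboot, the new settings can also be applied immediately; this is a sketch assuming sudo access:

```
# Reload /etc/sysctl.conf (requires root), then re-check the flag,
# which should now print 1
sudo sysctl -p
cat /proc/sys/net/ipv6/conf/all/disable_ipv6
```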
After the reboot, Eclipse still reported the same error, so IPv6 was not the cause. Further diagnosis showed that connections to 127.0.0.1:9000 on the local machine were failing. For example, the following command works fine:
[~/hadoop/hadoop-1.2.1/conf]$ hadoop dfs -ls /
Found 2 items
drwxr-xr-x - huangxing supergroup 0 2014-02-27 20:14 /tmp
drwxr-xr-x - huangxing supergroup 0 2014-02-27 19:45 /user
But the following command fails:
[~/hadoop/hadoop-1.2.1/conf]$ hadoop dfs -ls hdfs://127.0.0.1:9000/
14/02/27 22:03:20 INFO ipc.Client: Retrying connect to server: localhost/127.0.0.1:9000. Already tried 0 time(s); retry policy is RetryUpToMaximumCountWithFixedSleep(maxRetries=10, sleepTime=1 SECONDS)
14/02/27 22:03:21 INFO ipc.Client: Retrying connect to server: localhost/127.0.0.1:9000. Already tried 1 time(s); retry policy is RetryUpToMaximumCountWithFixedSleep(maxRetries=10, sleepTime=1 SECONDS)
14/02/27 22:03:22 INFO ipc.Client: Retrying connect to server: localhost/127.0.0.1:9000. Already tried 2 time(s); retry policy is RetryUpToMaximumCountWithFixedSleep(maxRetries=10, sleepTime=1 SECONDS)
I do not know why this happens; I am just noting this odd problem for now.
The core-site.xml and mapred-site.xml configuration files used localhost. I edited them to drop localhost and use the machine's Ethernet interface IP instead, restarted the Hadoop daemons, and everything worked. With that fixed, on to configuration:
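For reference, the change amounts to replacing localhost with the NIC address in both files. A sketch of the relevant core-site.xml property is shown below (192.168.1.100 is a placeholder; substitute your own Ethernet IP), with mapred-site.xml's mapred.job.tracker value changed the same way:

```xml
<configuration>
  <property>
    <name>fs.default.name</name>
    <!-- was hdfs://localhost:9000; use the machine's Ethernet IP instead -->
    <value>hdfs://192.168.1.100:9000</value>
  </property>
</configuration>
```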
2 Create a MapReduce project:
(1) Create a new MapReduce project.
(2) Copy hadoop/hadoop-1.2.1/src/examples/org/apache/hadoop/examples/WordCount.java into the new project's src directory and open it.
(3) Upload a text file (README.html) to the HDFS directory /user/huangxing/input:
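A sketch of the upload using the Hadoop 1.x shell (paths follow this walkthrough; adjust to your own setup, and note these commands need a running cluster):

```
# Create the target directory in HDFS and upload the file
hadoop dfs -mkdir /user/huangxing/input
hadoop dfs -put README.html /user/huangxing/input/
# Confirm the file arrived
hadoop dfs -ls /user/huangxing/input
```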
(4) Configure the run parameters.
Note that the URIs must be written out in full, otherwise the job fails with a file-not-found error.
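With the cluster addressed by IP as above, the two program arguments for WordCount would look like the following (the IP is a placeholder):

```
hdfs://192.168.1.100:9000/user/huangxing/input hdfs://192.168.1.100:9000/user/huangxing/output
```

Also note that the job fails if the output directory already exists, so remove it before re-running.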
(5) Click Run to execute the job.