HttpFS Installation Guide
Installation environment
Linux
Maven 3
JDK 1.6
A local Maven repository (Cloudera no longer maintains some of the dependency JARs)
- 1. Download the HttpFS source package
https://github.com/cloudera/httpfs
Download it with git:
git clone https://github.com/cloudera/httpfs.git
- 2. Edit pom.xml
Add the following dependency inside <dependencies>:
<dependency>
<groupId>org.apache.hadoop</groupId>
<artifactId>hadoop-core</artifactId>
<version>${cdh.hadoop.version}</version>
</dependency>
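- 3. Download the required dependencies
mvn clean install
Some of the dependency JARs are no longer available from Cloudera's repository, so you need to configure your own Maven repository by adding it to ~/.m2/settings.xml.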
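As a rough illustration of the repository setup described above, the following sketch writes a minimal Maven settings file that adds one extra repository through an always-active profile. The function name write_maven_settings, the profile id local-repo, and any URL passed in are assumptions made for this example; substitute the address of the repository you actually use.

```shell
# Write a minimal Maven settings file that adds one extra repository.
# $1 = output path, $2 = repository URL (hypothetical; use your own).
write_maven_settings() {
  cat > "$1" <<EOF
<settings>
  <profiles>
    <profile>
      <id>local-repo</id>
      <repositories>
        <repository>
          <id>local-repo</id>
          <url>$2</url>
        </repository>
      </repositories>
    </profile>
  </profiles>
  <activeProfiles>
    <activeProfile>local-repo</activeProfile>
  </activeProfiles>
</settings>
EOF
}

# Example (writes to a scratch file so an existing ~/.m2/settings.xml is not overwritten):
#   write_maven_settings /tmp/settings-example.xml "http://maven.example.com/repo"
```

Writing to a scratch file first makes it easy to review the result and merge it by hand into an existing settings file.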
- 4. Build and package
mvn package -Pdist
The resulting hadoop-hdfs-httpfs-0.20.2-cdh3u6.tar.gz package is written to the target directory.
- 5. Edit core-site.xml on every machine in the Hadoop cluster
Add the following:
<property>
<name>hadoop.proxyuser.httpfs.hosts</name>
<value>httpfs-host.foo.com</value>
</property>
<property>
<name>hadoop.proxyuser.httpfs.groups</name>
<value>*</value>
</property>
Restart the Hadoop cluster.
- 6. Create the httpfs user on the machine where HttpFS will be installed
useradd --create-home --shell /bin/bash httpfs
passwd httpfs
- 7. Install HttpFS
Copy hadoop-hdfs-httpfs-0.20.2-cdh3u6.tar.gz to /home/httpfs and extract it there.
Change into the extracted directory, hadoop-hdfs-httpfs-0.20.2-cdh3u6.
Copy the production cluster's Hadoop configuration files core-site.xml and hdfs-site.xml into /home/httpfs/hadoop-hdfs-httpfs-0.20.2-cdh3u6/etc/hadoop.
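The extract-and-copy steps above could be scripted roughly as follows. This is only a sketch: the function name install_httpfs and its three arguments are invented for illustration, and it assumes the tarball unpacks into a directory named after the archive that already contains etc/hadoop, as described in this step.

```shell
# Unpack the HttpFS tarball and copy the cluster's Hadoop config into it.
# $1 = path to the .tar.gz, $2 = install directory (e.g. /home/httpfs),
# $3 = directory holding the cluster's core-site.xml and hdfs-site.xml.
# The body runs in a subshell so its variables do not leak to the caller.
install_httpfs() (
  tarball=$1
  dest=$2
  conf=$3
  # Directory name inside the archive, assumed to match the archive name.
  name=$(basename "$tarball" .tar.gz)
  tar -xzf "$tarball" -C "$dest" || exit 1
  cp "$conf/core-site.xml" "$conf/hdfs-site.xml" "$dest/$name/etc/hadoop/"
)

# Example (the config directory is a placeholder for wherever your cluster config lives):
#   install_httpfs /home/httpfs/hadoop-hdfs-httpfs-0.20.2-cdh3u6.tar.gz /home/httpfs /path/to/cluster/conf
```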
- 8. Edit httpfs-site.xml
Add the following:
<property>
<name>httpfs.proxyuser.httpfs.hosts</name>
<value>*</value>
</property>
<property>
<name>httpfs.proxyuser.httpfs.groups</name>
<value>*</value>
</property>
- 9. Start HttpFS
Start it as the httpfs user:
/home/httpfs/hadoop-hdfs-httpfs-0.20.2-cdh3u6/sbin/httpfs.sh start
- 10. Verify
Check that the process exists: run jps and look for a Bootstrap process.
Check httpfs.log and the other logs in the logs directory for error messages.
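The two checks above can be wrapped in small helpers, sketched below. The function names are made up for this example, and the ERROR, FATAL, and Exception patterns are an assumption about what a problem looks like in the log; adjust them to match your actual log format.

```shell
# Succeeds if jps lists a Bootstrap process.
httpfs_running() {
  jps | grep -q 'Bootstrap'
}

# Prints log lines that look like problems.
# Exit status 0 means at least one suspicious line was found.
# $1 = path to a log file, e.g. logs/httpfs.log
log_problems() {
  grep -E 'ERROR|FATAL|Exception' "$1"
}

# Example:
#   httpfs_running && echo "HttpFS process found"
#   log_problems logs/httpfs.log || echo "no obvious errors"
```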
- 11. Test with curl
Upload a file
curl -i -X PUT "http://172.16.61.154:14000/webhdfs/v1/tmp/testfile?user.name=bdws&op=create"
Then send a second PUT to the URL returned by the first request:
curl -i -X PUT -T test.txt --header "Content-Type: application/octet-stream" "http://172.16.61.154:14000/webhdfs/v1/tmp/testfile?op=CREATE&user.name=bdws&data=true"
Download a file
curl -i "http://172.16.61.154:14000/webhdfs/v1/tmp/testfile?user.name=bdws&op=open"
HTTP/1.1 200 OK
Server: Apache-Coyote/1.1
Set-Cookie: hadoop.auth="u=bdws&p=bdws&t=simple&e=1400181237161&s=F5K1C44TbM/tMjbdFUpM+zExtso="; Version=1; Path=/
Content-Type: application/octet-stream
Content-Length: 20
Date: Thu, 15 May 2014 09:13:57 GMT

this is a test file
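The two-step upload can be chained so that the URL from the first response does not have to be copied by hand. The helper below is a sketch that assumes the server returns that URL in a Location header, as WebHDFS-style redirects normally do; the name extract_location is invented for this example.

```shell
# Reads HTTP response headers on stdin and prints the first Location value, if any.
extract_location() {
  # Strip carriage returns, then print whatever follows "Location:".
  tr -d '\r' | sed -n 's/^[Ll]ocation: *//p' | head -n 1
}

# Example, using the host and user from the commands above:
#   url=$(curl -s -i -X PUT "http://172.16.61.154:14000/webhdfs/v1/tmp/testfile?user.name=bdws&op=create" | extract_location)
#   curl -i -X PUT -T test.txt --header "Content-Type: application/octet-stream" "$url"
```

If the first response carries no Location header, url ends up empty, so check it before issuing the second request.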
- 12. References:
Hadoop HDFS over HTTP 0.20.2-cdh3u6 - Server Setup
http://cloudera.github.io/httpfs/ServerSetup.html
WebHDFS notes, very detailed, including command usage
http://zhangjie.me/webhdfs/
Apache Hadoop WebHDFS API documentation
http://hadoop.apache.org/docs/r1.0.4/webhdfs.html