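Note: the Hadoop version used is hadoop-2.5.0-cdh5.3.6
The Oozie version used is oozie-4.0.0-cdh5.3.6
In this test environment the user name is hadoop and the host name is hadoop01
2.4 Installation and deployment
(1) Download the packages and upload them to the directory /opt/software/cdh-5.3.6
There are two packages: the Oozie installation package and the ExtJS package, which provides the Oozie web console
(2) Extract the installation package into /opt/cdh-5.3.6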
[hadoop@hadoop01 cdh-5.3.6]$ tar -zxvf oozie-4.0.0-cdh5.3.6.tar.gz -C /opt/cdh-5.3.6/
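(3) What we install here is the Oozie server
OOZIE_HOME does not need to be set; it is configured automatically
(4) Configure the Hadoop proxy user
Use Notepad++ to edit the core-site.xml configuration file of the CDH Hadoop installation
Add two properties to core-site.xml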
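<!-- OOZIE: set both to the current user, here hadoop -->
<property>
<name>hadoop.proxyuser.hadoop.hosts</name> [contains the current user name]
<value>*</value> [host where Oozie is installed; * means all hosts, used here for convenience]
</property>
<property>
<name>hadoop.proxyuser.hadoop.groups</name> [contains the current user name]
<value>*</value> [groups of the users Oozie acts for; * means all groups, used here for convenience]
</property>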
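[Note]
The segment of the property name between proxyuser. and .hosts or .groups (highlighted in red in the original) is the current user name, not the host name. If it is set incorrectly, an error like the following is raised: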
org.apache.hadoop.ipc.RemoteException(org.apache.hadoop.security.authorize.AuthorizationException): User: hadoop is not allowed to impersonate hadoop
Get the user name:
[hadoop@hadoop01 hadoop-2.5.0-cdh5.3.6]$ whoami
hadoop
Get the host name:
[hadoop@hadoop01 hadoop-2.5.0-cdh5.3.6]$ hostname
hadoop01
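Because the user name is embedded in the property name, the two entries can be generated from the current user instead of being typed by hand. A minimal sketch, assuming a POSIX shell; the function name proxyuser_props is hypothetical and is not part of Hadoop or Oozie:

```shell
# Hypothetical helper: print the two proxyuser properties for a user.
# Defaults to the current user (whoami); "*" allows all hosts/groups,
# matching the convenience setting used in this walkthrough.
proxyuser_props() {
  user="${1:-$(whoami)}"
  for kind in hosts groups; do
    printf '<property>\n'
    printf '  <name>hadoop.proxyuser.%s.%s</name>\n' "$user" "$kind"
    printf '  <value>*</value>\n'
    printf '</property>\n'
  done
}
```

Calling proxyuser_props with no argument uses the output of whoami, so the generated names follow the user name rather than the host name.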
(5) Restart the Hadoop cluster (start and stop scripts are worth writing when time allows)
[hadoop@hadoop01 hadoop-2.5.0-cdh5.3.6]$ pwd
/opt/cdh-5.3.6/hadoop-2.5.0-cdh5.3.6
[hadoop@hadoop01 hadoop-2.5.0-cdh5.3.6]$ sbin/hadoop-daemon.sh stop namenode
stopping namenode
[hadoop@hadoop01 hadoop-2.5.0-cdh5.3.6]$ sbin/hadoop-daemon.sh stop datanode
stopping datanode
[hadoop@hadoop01 hadoop-2.5.0-cdh5.3.6]$ sbin/yarn-daemon.sh stop resourcemanager
stopping resourcemanager
[hadoop@hadoop01 hadoop-2.5.0-cdh5.3.6]$ sbin/yarn-daemon.sh stop nodemanager
stopping nodemanager
[hadoop@hadoop01 hadoop-2.5.0-cdh5.3.6]$ sbin/mr-jobhistory-daemon.sh stop historyserver
stopping historyserver
[hadoop@hadoop01 hadoop-2.5.0-cdh5.3.6]$ jps
4169 Jps
[hadoop@hadoop01 hadoop-2.5.0-cdh5.3.6]$ sbin/hadoop-daemon.sh start namenode
[hadoop@hadoop01 hadoop-2.5.0-cdh5.3.6]$ sbin/hadoop-daemon.sh start datanode
[hadoop@hadoop01 hadoop-2.5.0-cdh5.3.6]$ sbin/yarn-daemon.sh start resourcemanager
[hadoop@hadoop01 hadoop-2.5.0-cdh5.3.6]$ sbin/yarn-daemon.sh start nodemanager
[hadoop@hadoop01 hadoop-2.5.0-cdh5.3.6]$ sbin/mr-jobhistory-daemon.sh start historyserver
[hadoop@hadoop01 hadoop-2.5.0-cdh5.3.6]$ jps
4199 NameNode
4286 DataNode
4375 ResourceManager
4792 JobHistoryServer
4632 NodeManager
4823 Jps
[hadoop@hadoop01 hadoop-2.5.0-cdh5.3.6]$
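The stop and start sequence above could be wrapped in a small helper. This is only a sketch under assumptions: the function name hadoop_daemons and the DRY_RUN switch are made up for illustration, and HADOOP_HOME defaults to the path shown by pwd above. With DRY_RUN=1 (the default here) it only prints the commands it would run.

```shell
# Sketch of a restart helper for the single-node setup in this walkthrough.
# HADOOP_HOME default and DRY_RUN are assumptions of this sketch.
HADOOP_HOME="${HADOOP_HOME:-/opt/cdh-5.3.6/hadoop-2.5.0-cdh5.3.6}"
DRY_RUN="${DRY_RUN:-1}"   # 1 = only print the commands, 0 = run them

# Apply one action ("start" or "stop") to every daemon, in the order used above.
hadoop_daemons() {
  action="$1"
  for entry in \
    "hadoop-daemon.sh namenode" \
    "hadoop-daemon.sh datanode" \
    "yarn-daemon.sh resourcemanager" \
    "yarn-daemon.sh nodemanager" \
    "mr-jobhistory-daemon.sh historyserver"
  do
    script="${entry%% *}"   # text before the space
    daemon="${entry##* }"   # text after the space
    if [ "$DRY_RUN" = "1" ]; then
      echo "sbin/$script $action $daemon"
    else
      "$HADOOP_HOME/sbin/$script" "$action" "$daemon"
    fi
  done
}
```

A restart would then be hadoop_daemons stop followed by hadoop_daemons start, with DRY_RUN=0 set once the printed commands look right.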
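(6) Extract hadooplibs, which produces a folder oozie-4.0.0-cdh
Run the extract command in the Oozie home directory; the resulting folder contains the Hadoop jars that Oozie needs for the different Hadoop versions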
[hadoop@hadoop01 oozie-4.0.0-cdh5.3.6]$ tar -zxvf oozie-hadooplibs-4.0.0-cdh5.3.6.tar.gz
(7) Create a libext folder in the Oozie home directory
[hadoop@hadoop01 oozie-4.0.0-cdh5.3.6]$ mkdir libext
(8) Copy the hadooplib jars into libext. Note: copy the jar files themselves, not the folder
[hadoop@hadoop01 oozie-4.0.0-cdh5.3.6]$ cp oozie-4.0.0-cdh5.3.6/hadooplibs/hadooplib-2.5.0-cdh5.3.6.oozie-4.0.0-cdh5.3.6/* libext/
(9) Copy the ExtJS package into libext. Note: do not unzip it; copying the zip file is enough
[hadoop@hadoop01 oozie-4.0.0-cdh5.3.6]$ cp /opt/software/cdh-5.3.6/ext-2.2.zip libext/
Check that the ext package is present
[hadoop@hadoop01 oozie-4.0.0-cdh5.3.6]$ ls libext/ |grep ext
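Steps (7) to (9) can be verified in one go before moving on. The following is a sketch, assuming a POSIX shell; check_libext is a hypothetical helper name. It reports a problem if the ExtJS zip is missing or if no jar sits directly in libext, which is what happens when the folder is copied instead of the jars.

```shell
# Hypothetical check: libext must hold ext-2.2.zip and at least one jar
# directly inside it. Prints a message and returns non-zero otherwise.
check_libext() {
  dir="${1:-libext}"
  if [ ! -f "$dir/ext-2.2.zip" ]; then
    echo "missing: $dir/ext-2.2.zip"
    return 1
  fi
  # ls fails when the glob matches nothing
  if ! ls "$dir"/*.jar >/dev/null 2>&1; then
    echo "missing: no jar files in $dir"
    return 1
  fi
  echo "ok: $dir"
}
```

Run from the Oozie home directory, check_libext with no argument inspects ./libext.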
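(10) Create the sharelib:
Rationale: the sharelib is created on HDFS and holds the dependencies needed to run jobs. Most of what runs on Oozie is MapReduce tasks, which need the jars of the various frameworks; because these tasks read from and write to HDFS by default, the dependency jars have to be available there as well.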
[hadoop@hadoop01 oozie-4.0.0-cdh5.3.6]$ bin/oozie-setup.sh sharelib create -fs hdfs://hadoop01:8020 -locallib oozie-sharelib-4.0.0-cdh5.3.6-yarn.tar.gz
[Screenshot: result of a successful sharelib creation]
Verify the result through the web UI
(11) Build the WAR
Build the WAR, which bundles all the jars; this may take a while
[hadoop@hadoop01 oozie-4.0.0-cdh5.3.6]$ bin/oozie-setup.sh prepare-war
On success it prints:
INFO: Oozie is ready to be started
(12) Initialize the database
[hadoop@hadoop01 oozie-4.0.0-cdh5.3.6]$ bin/ooziedb.sh create -sqlfile oozie.sql -run
(13) Start the Oozie instance
[hadoop@hadoop01 oozie-4.0.0-cdh5.3.6]$ bin/oozied.sh start
Note: if startup fails because a pid file already exists, go to its location and delete the pid file
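Deleting the pid file is only safe when the process it names is really gone. A minimal sketch of that check, assuming a POSIX shell; remove_stale_pid is a hypothetical helper, and the pid file path is passed in because its location depends on the installation:

```shell
# Hypothetical helper: delete a pid file only if its process no longer exists.
remove_stale_pid() {
  pidfile="$1"
  [ -f "$pidfile" ] || { echo "no pid file"; return 0; }
  pid="$(cat "$pidfile")"
  # kill -0 sends no signal; it only tests whether the process can be signalled
  if [ -n "$pid" ] && kill -0 "$pid" 2>/dev/null; then
    echo "process $pid is still running; pid file kept"
    return 1
  fi
  rm -f "$pidfile"
  echo "removed stale pid file"
}
```

Note that kill -0 also fails for a live process owned by another user, so this sketch is meant to be run as the same user that started Oozie.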
(14) Check the process
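The text does not show the process listing for this step. One way to check it is to feed jps output into a small filter that reports which expected names are absent. This is a sketch: missing_daemons is a hypothetical helper, and the idea that the Oozie server shows up in jps as Bootstrap (the embedded Tomcat launcher) is an assumption to confirm on your own installation.

```shell
# Hypothetical helper: read jps output on stdin and print every expected
# process name (given as arguments) that does not appear in it.
missing_daemons() {
  listing="$(cat)"
  rc=0
  for name in "$@"; do
    # jps prints "<pid> <name>"; compare the name against the second field
    if ! printf '%s\n' "$listing" |
         awk -v n="$name" '$2 == n { found = 1 } END { exit !found }'; then
      echo "missing: $name"
      rc=1
    fi
  done
  return $rc
}
```

A possible use is jps | missing_daemons NameNode DataNode Bootstrap, which prints nothing and returns 0 when all three names are present.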
(15) Open the console in a browser on port 11000: http://hadoop01:11000/oozie/