1. Install a CentOS image in Docker
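A minimal sketch of this step (the centos:7 tag and the container name are assumptions):
docker pull centos:7
docker run -it --name hadoop-base centos:7 /bin/bash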
 
2. Install Java
Download the JDK with wget:
wget --no-cookies --no-check-certificate --header "Cookie: gpw_e24=http%3A%2F%2Fwww.oracle.com%2F; oraclelicense=accept-securebackup-cookie" "https://download.oracle.com/otn-pub/java/jdk/8u201-b09/42970487e3af4f5aa5bca3f542482c60/jdk-8u201-linux-x64.tar.gz"
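The JAVA_HOME setting below assumes the archive is unpacked under /usr/java; a sketch of that step:
mkdir -p /usr/java
tar -zxvf jdk-8u201-linux-x64.tar.gz -C /usr/java
# produces /usr/java/jdk1.8.0_201, the path used as JAVA_HOME below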
3. Download Hadoop (3.2.0, the latest release at the time of writing)
mkdir /usr/hadoop/
cd /usr/hadoop/
wget https://mirrors.tuna.tsinghua.edu.cn/apache/hadoop/common/hadoop-3.2.0/hadoop-3.2.0.tar.gz
tar -zxvf hadoop-3.2.0.tar.gz
4. Configure environment variables
vim /etc/profile
Add the following:
#JAVA VARIABLES START
export JAVA_HOME=/usr/java/jdk1.8.0_201
export CLASSPATH=.:$JAVA_HOME/lib/dt.jar:$JAVA_HOME/lib/tools.jar
export PATH=$PATH:$JAVA_HOME/bin
#JAVA VARIABLES END
 
#HADOOP VARIABLES START
export HADOOP_HOME=/usr/hadoop/hadoop-3.2.0
#export HADOOP_INSTALL=$HADOOP_HOME
#export HADOOP_MAPRED_HOME=$HADOOP_HOME
#export HADOOP_COMMON_HOME=$HADOOP_HOME
#export HADOOP_HDFS_HOME=$HADOOP_HOME
#export YARN_HOME=$HADOOP_HOME
#export HADOOP_COMMON_LIB_NATIVE_DIR=$HADOOP_HOME/lib/native
export PATH=$HADOOP_HOME/sbin:$HADOOP_HOME/bin:$PATH
#export CLASSPATH=$($HADOOP_HOME/bin/hadoop classpath):$CLASSPATH
#HADOOP VARIABLES END 
Then run: source /etc/profile
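A quick sanity check that both tools now resolve (assuming the paths above exist):
java -version
hadoop version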
 
Save the updated container as an image: docker commit 6ebd4423e2de hadoop-master
5. Configure Hadoop
Edit the configuration files under /usr/hadoop/hadoop-3.2.0/etc/hadoop/.
1) core-site.xml
  <property>
    <name>hadoop.tmp.dir</name>
    <value>/usr/hadoop/hadoop-3.2.0/tmp</value>
  </property>
  <property>
    <name>fs.defaultFS</name>
    <value>hdfs://master:9000</value>
    <final>true</final>
  </property>
(fs.default.name is the older, deprecated name for this property; Hadoop 3.x uses fs.defaultFS.)
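Each of these files must wrap its properties in a <configuration> element; for reference, a minimal complete core-site.xml assembled from the snippet above:
<?xml version="1.0" encoding="UTF-8"?>
<configuration>
  <property>
    <name>hadoop.tmp.dir</name>
    <value>/usr/hadoop/hadoop-3.2.0/tmp</value>
  </property>
  <property>
    <name>fs.defaultFS</name>
    <value>hdfs://master:9000</value>
    <final>true</final>
  </property>
</configuration>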
 
2) hdfs-site.xml
  <property>
    <name>dfs.replication</name>
    <value>2</value>
    <final>true</final>
  </property>
  <property>
    <name>dfs.namenode.name.dir</name>
    <value>/usr/hadoop/namenode</value>
  </property>
  <property>
    <name>dfs.datanode.data.dir</name>
    <value>/usr/hadoop/datanode</value>
  </property>
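It does no harm to create the two directories above up front on each node (Hadoop can usually create them itself, but permission problems are a common failure mode):
mkdir -p /usr/hadoop/namenode /usr/hadoop/datanode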
 
3) mapred-site.xml
mapred.job.tracker is a Hadoop 1.x (JobTracker) property that Hadoop 3.x ignores; on 3.2.0 MapReduce runs on YARN instead:
  <property>
    <name>mapreduce.framework.name</name>
    <value>yarn</value>
  </property>
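yarn-site.xml is left at its defaults here; on a multi-container cluster you will usually also want to point the NodeManagers at the ResourceManager (a minimal sketch, assuming the master hostname used above):
  <property>
    <name>yarn.resourcemanager.hostname</name>
    <value>master</value>
  </property>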
 
4) Set JAVA_HOME for Hadoop
vim /usr/hadoop/hadoop-3.2.0/etc/hadoop/hadoop-env.sh
Set: export JAVA_HOME=/usr/java/jdk1.8.0_201
(This is needed even with JAVA_HOME in /etc/profile, because the daemons are launched over non-login ssh sessions that do not source it.)
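An equivalent non-interactive edit (a sketch; appending works because the last definition in the file wins):
echo "export JAVA_HOME=/usr/java/jdk1.8.0_201" >> /usr/hadoop/hadoop-3.2.0/etc/hadoop/hadoop-env.sh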
 
5) Format the NameNode
cd /usr/hadoop/hadoop-3.2.0/bin
./hdfs namenode -format
(hadoop namenode -format still works on 3.2.0 but prints a deprecation warning; hdfs namenode -format is the current form.)
 
Install SSH
Check whether it is already installed: rpm -qa | grep ssh
Install it: yum install openssh*
Passwordless SSH login on CentOS 7:
1. Generate a key pair: ssh-keygen -t rsa
 
2. Append the public key to the authorized-keys file:
cd ~/.ssh
cat id_rsa.pub >> authorized_keys
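sshd also checks file permissions, so it is worth tightening them and testing (standard practice, assumed here rather than taken from these notes):
chmod 700 ~/.ssh
chmod 600 ~/.ssh/authorized_keys
ssh localhost   # should now log in without a password prompt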
 
Save the container as a new image:
docker commit b243b3926f0a hadoop-basic
Create the master, slave1, and slave2 containers from hadoop-basic by running:
docker run -p 50070:50070 -p 19888:19888 -p 8088:8088 --name master -ti -h master hadoop-basic
docker run -it -h slave1 --name slave1 hadoop-basic /bin/bash
docker run -it -h slave2 --name slave2 hadoop-basic /bin/bash
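The three containers must be able to resolve each other's hostnames for HDFS to work (see Problem 8 below). One approach is to add entries to /etc/hosts in every container; the IPs here are examples only and should be taken from docker inspect:
172.17.0.2 master
172.17.0.3 slave1
172.17.0.4 slave2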
 
Check that the DataNodes registered correctly: hdfs dfsadmin -report
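jps on each container is another quick check (it ships with the JDK installed above):
jps   # expect NameNode and SecondaryNameNode on master, DataNode on the slaves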
 
Troubleshooting
Problem 1:
   Starting namenodes on [localhost]
    ERROR: Attempting to operate on hdfs namenode as root
    ERROR: but there is no HDFS_NAMENODE_USER defined. Aborting operation.
    Starting datanodes
    ERROR: Attempting to operate on hdfs datanode as root
    ERROR: but there is no HDFS_DATANODE_USER defined. Aborting operation.
    Starting secondary namenodes [bogon]
    ERROR: Attempting to operate on hdfs secondarynamenode as root
    ERROR: but there is no HDFS_SECONDARYNAMENODE_USER defined. Aborting operation.
    Fix 1
        $ vim sbin/start-dfs.sh
        $ vim sbin/stop-dfs.sh
    Add the following to both files:
HDFS_DATANODE_USER=root
HDFS_DATANODE_SECURE_USER=hdfs
HDFS_NAMENODE_USER=root
HDFS_SECONDARYNAMENODE_USER=root
    Fix 2
        $ vim sbin/start-yarn.sh
        $ vim sbin/stop-yarn.sh
    Add the following to both files:
YARN_RESOURCEMANAGER_USER=root
HADOOP_SECURE_DN_USER=yarn
YARN_NODEMANAGER_USER=root
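An alternative to patching the sbin scripts is to define the same users once in etc/hadoop/hadoop-env.sh (both approaches are common; this is a sketch):
export HDFS_NAMENODE_USER=root
export HDFS_DATANODE_USER=root
export HDFS_SECONDARYNAMENODE_USER=root
export YARN_RESOURCEMANAGER_USER=root
export YARN_NODEMANAGER_USER=root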
 
Problem 2:
localhost: ssh: connect to host localhost port 22: Cannot assign requested address
cd /etc/ssh
vim sshd_config
Add the line: Port 22
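Then restart the daemon so the change takes effect; in a minimal container without systemd that is simply:
/usr/sbin/sshd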
Problem 3:
Failed to get D-Bus connection: Operation not permitted
Fix: docker run --privileged -ti -e "container=docker" -v /sys/fs/cgroup:/sys/fs/cgroup hadoop-master /usr/sbin/init
 
Problem 4:
sshd re-exec requires execution with an absolute path
This error appears when starting the sshd service. Starting it via the absolute path instead fails with:
Could not load host key: /etc/ssh/ssh_host_key
Could not load host key: /etc/ssh/ssh_host_rsa_key
Could not load host key: /etc/ssh/ssh_host_dsa_key
Disabling protocol version 1. Could not load host key
Disabling protocol version 2. Could not load host key
sshd: no hostkeys available -- exiting
Fix:
#ssh-keygen -t dsa -f /etc/ssh/ssh_host_dsa_key
#ssh-keygen -t rsa -f /etc/ssh/ssh_host_rsa_key
#/usr/sbin/sshd
Running that still reports:
Could not load host key: /etc/ssh/ssh_host_ecdsa_key
Could not load host key: /etc/ssh/ssh_host_ed25519_key
Fix (the key type must match the file name):
#ssh-keygen -t ecdsa -f /etc/ssh/ssh_host_ecdsa_key
#ssh-keygen -t ed25519 -f /etc/ssh/ssh_host_ed25519_key
#/usr/sbin/sshd
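On CentOS 7, ssh-keygen -A is a shorter equivalent that generates every missing host key type in one step:
ssh-keygen -A
/usr/sbin/sshd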
 
Problem 5:
WARNING: HADOOP_SECURE_DN_USER has been replaced by HDFS_DATANODE_SECURE_USER. Using value of HADOOP_SECURE_DN_USER.
Starting namenodes on [master]
master: /usr/hadoop/hadoop-3.2.0/libexec/hadoop-functions.sh: line 982: ssh: command not found
Starting datanodes
Last login: Mon Jan 28 08:32:32 UTC 2019 on pts/0
localhost: /usr/hadoop/hadoop-3.2.0/libexec/hadoop-functions.sh: line 982: ssh: command not found
Starting secondary namenodes [b982e2adc393]
Last login: Mon Jan 28 08:32:33 UTC 2019 on pts/0
b982e2adc393: /usr/hadoop/hadoop-3.2.0/libexec/hadoop-functions.sh: line 982: ssh: command not found
Starting resourcemanager
Last login: Mon Jan 28 08:32:35 UTC 2019 on pts/0
Starting nodemanagers
Last login: Mon Jan 28 08:32:42 UTC 2019 on pts/0
localhost: /usr/hadoop/hadoop-3.2.0/libexec/hadoop-functions.sh: line 982: ssh: command not found
 
Fix:
The deprecated variable is the one Fix 2 above added to the YARN scripts:
 $ vim sbin/start-yarn.sh
 $ vim sbin/stop-yarn.sh
Replace HADOOP_SECURE_DN_USER=yarn with HDFS_DATANODE_SECURE_USER=yarn.
The CentOS image ships with the ssh server but not the client.
Check the ssh installation:
# rpm -qa | grep openssh
openssh-5.3p1-123.el6_9.x86_64
openssh-server-5.3p1-123.el6_9.x86_64
openssh-clients is missing, so install it with yum:
yum -y install openssh-clients
 
 
Problem 6: Failed to get D-Bus connection: Operation not permitted (same as Problem 3 above)
Problem 7: docker: Error response from daemon: cgroups: cannot find cgroup mount destination: unknown.
No specific fix was found for Problem 7; it went away after a restart.
 
Problem 8: Datanode denied communication with namenode because hostname cannot be resolved
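Two common remedies (hedged suggestions, assuming the usual causes): make sure every DataNode's hostname resolves from the NameNode (e.g. the /etc/hosts entries shown earlier), or relax the check in hdfs-site.xml:
  <property>
    <name>dfs.namenode.datanode.registration.ip-hostname-check</name>
    <value>false</value>
  </property>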
 