编译:

0. 环境准备

maven(下载安装,配置环境变量,修改sitting.xml加阿里云镜像)

gcc-c++

zlib-devel

autoconf

automake

libtool

通过yum安装即可,yum -y install gcc-c++ lzo-devel zlib-devel autoconf automake libtool

1. 下载、安装并编译LZO

wget http://www.oberhumer.com/opensource/lzo/download/lzo-2.10.tar.gz

tar -zxvf lzo-2.10.tar.gz

cd lzo-2.10

./configure -prefix=/usr/local/hadoop/lzo/

make

make install

2. 编译hadoop-lzo源码

2.1 下载hadoop-lzo的源码,下载地址:https://github.com/twitter/hadoop-lzo/archive/master.zip

2.2 解压之后,修改pom.xml
     <hadoop.current.version>2.7.2</hadoop.current.version>

2.3 声明两个临时环境变量
      export C_INCLUDE_PATH=/usr/local/hadoop/lzo/include
      export LIBRARY_PATH=/usr/local/hadoop/lzo/lib

2.4 编译
     进入hadoop-lzo-master,执行maven编译命令
     mvn package -Dmaven.test.skip=true

2.5 进入target,将hadoop-lzo-0.4.21-SNAPSHOT.jar放到hadoop的classpath下,如${HADOOP_HOME}/share/hadoop/common

2.6 修改core-site.xml增加配置支持LZO压缩
     <configuration>
         <property>
             <name>io.compression.codecs</name>
             <value>
             org.apache.hadoop.io.compress.GzipCodec,
             org.apache.hadoop.io.compress.DefaultCodec,
             org.apache.hadoop.io.compress.BZip2Codec,
             org.apache.hadoop.io.compress.SnappyCodec,
             com.hadoop.compression.lzo.LzoCodec,
             com.hadoop.compression.lzo.LzopCodec
             </value>
         </property>
         <property>
             <name>io.compression.codec.lzo.class</name>
             <value>com.hadoop.compression.lzo.LzoCodec</value>
         </property>
     </configuration>

<mirror>
         <id>nexus-aliyun</id>
         <mirrorOf>*</mirrorOf>
         <name>Nexus aliyun</name>
         <url>http://maven.aliyun.com/nexus/content/groups/public</url>

</mirror>

配置lzo:

1)先下载lzo的jar项目

https://github.com/twitter/hadoop-lzo/archive/master.zip

2)下载后的文件名是hadoop-lzo-master,它是一个zip格式的压缩包,先进行解压,然后用maven编译。生成hadoop-lzo-0.4.20.jar。

3)将编译好后的hadoop-lzo-0.4.20.jar 放入hadoop-2.7.2/share/hadoop/common/

[atguigu@hadoop102 common]$ pwd

/opt/module/hadoop-2.7.2/share/hadoop/common

[atguigu@hadoop102 common]$ ls

hadoop-lzo-0.4.20.jar

4)同步hadoop-lzo-0.4.20.jar到hadoop103、hadoop104

[atguigu@hadoop102 common]$ xsync hadoop-lzo-0.4.20.jar

core-site.xml增加配置支持LZO压缩

<?xml version="1.0" encoding="UTF-8"?>

<?xml-stylesheet type="text/xsl" href="configuration.xsl"?>

<configuration>

<property>

<name>io.compression.codecs</name>

<value>

org.apache.hadoop.io.compress.GzipCodec,

org.apache.hadoop.io.compress.DefaultCodec,

org.apache.hadoop.io.compress.BZip2Codec,

org.apache.hadoop.io.compress.SnappyCodec,

com.hadoop.compression.lzo.LzoCodec,

com.hadoop.compression.lzo.LzopCodec

</value>

</property>

<property>

<name>io.compression.codec.lzo.class</name>

<value>com.hadoop.compression.lzo.LzoCodec</value>

</property>

</configuration>

5)同步core-site.xml到hadoop103、hadoop104

[atguigu@hadoop102 hadoop]$ xsync core-site.xml

6)启动及查看集群

[atguigu@hadoop102 hadoop-2.7.2]$ sbin/start-dfs.sh

)web和进程查看

Ø Web查看:http://hadoop102:50070

Ø 进程查看:jps查看各个节点状态。

(2)当启动发生错误的时候:

Ø 查看日志:/home/atguigu/module/hadoop-2.7.2/logs

Ø 如果进入安全模式,可以通过hdfs dfsadmin -safemode leave

Ø 停止所有进程,删除data和log文件夹,然后hdfs namenode -format 来格式化

hadoop jar /opt/module/hadoop-2.7.2/share/hadoop/mapreduce/hadoop-mapreduce-examples-2.7.2.jar wordcount -Dmapreduce.output.fileoutputformat.compress=true -Dmapreduce.output.fileoutputformat.compress.codec=com.hadoop.compression.lzo.LzopCodec /input /output //测试
05-11 14:59