How to use CompressionCodec in Hadoop
Problem description
I am doing the following to compress the output files from a reducer:
OutputStream out = ipFs.create( new Path( opDir + "/" + fileName ) );
CompressionCodec codec = new GzipCodec();
OutputStream cs = codec.createOutputStream( out );
BufferedWriter cout = new BufferedWriter( new OutputStreamWriter( cs ) );
cout.write( ... )
But I get a NullPointerException on line 3 (the codec.createOutputStream call):
java.lang.NullPointerException
at org.apache.hadoop.io.compress.zlib.ZlibFactory.isNativeZlibLoaded(ZlibFactory.java:63)
at org.apache.hadoop.io.compress.GzipCodec.createOutputStream(GzipCodec.java:92)
at myFile$myReduce.reduce(myFile.java:354)
I also found a JIRA issue describing the same problem.
Can you please tell me whether I am doing something wrong?
Solution
You should use the CompressionCodecFactory if you want to use compression outside of the standard OutputFormat handling (as detailed in @linker's answer):
// conf is the job's Configuration; outputStream is the stream returned by FileSystem.create(...)
CompressionCodecFactory ccf = new CompressionCodecFactory(conf);
CompressionCodec codec = ccf.getCodecByClassName(GzipCodec.class.getName());
OutputStream compressedOutputStream = codec.createOutputStream(outputStream);
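The stack trace suggests why the factory helps: a codec created with a bare constructor call appears to have no Configuration attached, so the first lookup against it fails, whereas a factory hands its configuration to each codec it creates. The sketch below models that pattern using only the JDK so it can run without a cluster. MiniGzipCodec, MiniCodecFactory and the "buffer.size" key are hypothetical stand-ins invented for illustration; they are not Hadoop classes or settings.

```java
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.OutputStream;
import java.io.UncheckedIOException;
import java.nio.charset.StandardCharsets;
import java.util.HashMap;
import java.util.Map;
import java.util.zip.GZIPInputStream;
import java.util.zip.GZIPOutputStream;

public class CodecConfigSketch {

    /** Hypothetical stand-in for a configurable codec; NOT a Hadoop class. */
    static class MiniGzipCodec {
        private Map<String, String> conf; // stays null unless setConf is called

        void setConf(Map<String, String> conf) {
            this.conf = conf;
        }

        OutputStream createOutputStream(OutputStream out) throws IOException {
            // Dereferences conf, so an unconfigured codec throws NullPointerException here.
            int bufferSize = Integer.parseInt(conf.getOrDefault("buffer.size", "4096"));
            return new GZIPOutputStream(out, bufferSize);
        }
    }

    /** Hypothetical stand-in for a factory that injects the configuration. */
    static class MiniCodecFactory {
        private final Map<String, String> conf;

        MiniCodecFactory(Map<String, String> conf) {
            this.conf = conf;
        }

        MiniGzipCodec getCodec() {
            MiniGzipCodec codec = new MiniGzipCodec();
            codec.setConf(conf); // the step a bare constructor call skips
            return codec;
        }
    }

    /** Returns true if a codec built with a bare constructor call fails with an NPE. */
    public static boolean bareCodecFails() {
        try {
            new MiniGzipCodec().createOutputStream(new ByteArrayOutputStream());
            return false;
        } catch (NullPointerException e) {
            return true;
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    /** Compresses text through a factory-built codec, then decompresses it again. */
    public static String roundTrip(String text) {
        try {
            MiniGzipCodec codec = new MiniCodecFactory(new HashMap<>()).getCodec();
            ByteArrayOutputStream sink = new ByteArrayOutputStream();
            try (OutputStream cs = codec.createOutputStream(sink)) {
                cs.write(text.getBytes(StandardCharsets.UTF_8));
            }
            ByteArrayOutputStream plain = new ByteArrayOutputStream();
            try (GZIPInputStream in =
                     new GZIPInputStream(new ByteArrayInputStream(sink.toByteArray()))) {
                byte[] buf = new byte[1024];
                int n;
                while ((n = in.read(buf)) != -1) {
                    plain.write(buf, 0, n);
                }
            }
            return new String(plain.toByteArray(), StandardCharsets.UTF_8);
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    public static void main(String[] args) {
        System.out.println("bare codec fails: " + bareCodecFails());
        System.out.println("round trip: " + roundTrip("hello"));
    }
}
```

In the real reducer, CompressionCodecFactory plays the role that MiniCodecFactory plays here, which is presumably why obtaining the codec from it avoids the exception.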