Problem description
I am working on a simple MapReduce program. I want to create a separate file after the reducer for each distinct word in the key. For example, after running the MapReduce job I have something like:
Priority1 x 2
Priority1 y 2
Priority1 z 2
priority2 x 2
priority2 y 2
Now I want separate files after the reduce phase, say Priority1 and Priority2, each holding all the values for that priority. I am using Java and want to know what should be written in the reducer to get this kind of output.
I just want to know whether this is even possible and, if it is, how to approach or solve it. I am using Hadoop 0.20.203, so MultipleOutputs doesn't work.
Any pointers will be helpful. Thanks for the help!
Atul
You need to create a partitioner class first that partitions based on your criteria.
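As a rough illustration of that partitioning rule, here is a minimal plain-Java sketch. It deliberately uses no Hadoop classes. The class name `PriorityPartitionSketch`, the helper `priorityOf`, and the idea that the priority is the first whitespace-separated token of the key are all assumptions made for the example. In a real job the same arithmetic would go into the `getPartition` method of your own `Partitioner` subclass.

```java
// Plain-Java sketch of the partitioning rule only; no Hadoop classes are used.
final class PriorityPartitionSketch {
    private PriorityPartitionSketch() {}

    // Hypothetical helper: treat the first whitespace-separated token
    // of the key as its priority (an assumption about the key layout).
    static String priorityOf(String key) {
        String trimmed = key.trim();
        int end = 0;
        while (end < trimmed.length() && !Character.isWhitespace(trimmed.charAt(end))) {
            end++;
        }
        return trimmed.substring(0, end);
    }

    // Map a key to a partition in [0, numPartitions). Masking with
    // Integer.MAX_VALUE clears the sign bit so the result is never negative.
    static int partitionFor(String key, int numPartitions) {
        return (priorityOf(key).hashCode() & Integer.MAX_VALUE) % numPartitions;
    }
}
```

All keys that share a priority land in the same partition, so one reducer sees all of them. Two different priorities can still hash to the same partition, which is why the output side also has to separate records by priority.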
You then need to create your own OutputFormat class and a RecordWriter class.
The RecordWriter class needs to write to different files as per your needs. Further, if you need to sort your values, create a comparator class for your key field.
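The routing that such a record writer performs can be sketched in plain Java, again without any Hadoop classes. `PerKeyRecordRouter` and its `opener` function are hypothetical names introduced for this example. The opener stands in for whatever code creates one output stream per priority. In a real job that would be a file under the job's output directory, opened through Hadoop's `FileSystem`.

```java
import java.io.IOException;
import java.io.UncheckedIOException;
import java.io.Writer;
import java.util.LinkedHashMap;
import java.util.Map;
import java.util.function.Function;

// Plain-Java sketch of per-priority routing; not a Hadoop RecordWriter.
final class PerKeyRecordRouter implements AutoCloseable {
    // One open writer per priority, created lazily on first use.
    private final Map<String, Writer> writers = new LinkedHashMap<>();
    // Hypothetical factory: given a priority name, opens the output for it.
    private final Function<String, Writer> opener;

    PerKeyRecordRouter(Function<String, Writer> opener) {
        this.opener = opener;
    }

    // Append the value as one line to the output that belongs to the priority.
    void write(String priority, String value) {
        Writer out = writers.computeIfAbsent(priority, opener);
        try {
            out.write(value);
            out.write('\n');
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    // Close every output that was opened.
    @Override
    public void close() {
        for (Writer w : writers.values()) {
            try {
                w.close();
            } catch (IOException e) {
                throw new UncheckedIOException(e);
            }
        }
        writers.clear();
    }
}
```

With the sample data, writing ("Priority1", "x 2"), ("Priority1", "y 2") and ("priority2", "x 2") opens two outputs, one per priority, and appends each value to its own output. A custom RecordWriter would do the same inside its write method and close all open files in its close method.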