Question
I want to get started using Avro with MapReduce. Can someone suggest a good tutorial or example to start with? I couldn't find much by searching the internet.
Recommended answer
I recently did a project that was heavily based on Avro data. Not having used this data format before, I had to start from scratch. You are right that it is rather hard to get much help from online sources when getting started with Avro. The material I would recommend is:
- By far the most helpful source I found was the Avro section (p103-p116) in Tom White's book Hadoop: The Definitive Guide, along with his GitHub page for the code he uses in the book.
- For additional code examples, I looked at Ron Bodkin's GitHub page avro-mr-sample.
- In my case I used Python for reading and writing Avro files, and for that I used this tutorial.
- Even though it is obvious, I will add the link to the Avro Users mailing list. There is a ton of information to be found there. After I had read the above material and implemented a bunch of code, I found myself spending hours looking through the archives.
My last suggestion is to use Avro 1.4.1 with Hadoop 0.20.2, and ONLY that combination. I had some major issues getting my code to run with Hadoop 0.21 and more recent Avro versions.