Why does Hadoop need classes like Text or IntWritable instead of String or Integer?

Problem description

Why does Hadoop need to introduce these new classes? They just seem to complicate the interface.

Solution

Hadoop introduces these classes so that it can handle objects the Hadoop way. For example, Hadoop uses Text instead of Java's String. The Text class is similar to a Java String, but Text also implements the interfaces Comparable, Writable, and WritableComparable.
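To make the role of these interfaces concrete, here is a minimal sketch of a key type in the style of IntWritable. It assumes nothing beyond the JDK: the `Writable` and `WritableComparable` interfaces below are local stand-ins that mirror the method shapes of Hadoop's `org.apache.hadoop.io.Writable` and `WritableComparable`, and `IntKey`, `serialize`, and `deserialize` are hypothetical names used only for illustration.

```java
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.DataInput;
import java.io.DataInputStream;
import java.io.DataOutput;
import java.io.DataOutputStream;
import java.io.IOException;
import java.io.UncheckedIOException;

public class WritableKeySketch {

    // Local stand-in with the same two methods as org.apache.hadoop.io.Writable,
    // declared here so the sketch compiles without Hadoop on the classpath.
    interface Writable {
        void write(DataOutput out) throws IOException;
        void readFields(DataInput in) throws IOException;
    }

    // Local stand-in for org.apache.hadoop.io.WritableComparable.
    interface WritableComparable<T> extends Writable, Comparable<T> {}

    // Hypothetical key type, loosely analogous to IntWritable.
    static final class IntKey implements WritableComparable<IntKey> {
        private int value;

        IntKey() {}                       // no-arg constructor, filled in by readFields
        IntKey(int value) { this.value = value; }

        int get() { return value; }

        @Override
        public void write(DataOutput out) throws IOException {
            out.writeInt(value);          // only the field itself is written
        }

        @Override
        public void readFields(DataInput in) throws IOException {
            value = in.readInt();         // the same instance can be reused for each record
        }

        @Override
        public int compareTo(IntKey other) {
            return Integer.compare(value, other.value);  // ordering used when keys are sorted
        }
    }

    // Serializes any Writable into a byte array.
    static byte[] serialize(Writable w) {
        ByteArrayOutputStream bytes = new ByteArrayOutputStream();
        try (DataOutputStream out = new DataOutputStream(bytes)) {
            w.write(out);
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
        return bytes.toByteArray();
    }

    // Rebuilds an IntKey from bytes produced by serialize.
    static IntKey deserialize(byte[] data) {
        IntKey key = new IntKey();
        try (DataInputStream in = new DataInputStream(new ByteArrayInputStream(data))) {
            key.readFields(in);
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
        return key;
    }

    public static void main(String[] args) {
        byte[] data = serialize(new IntKey(7));
        IntKey copy = deserialize(data);
        System.out.println(data.length + " bytes, value " + copy.get()); // 4 bytes, value 7
        System.out.println(copy.compareTo(new IntKey(9)) < 0);           // true
    }
}
```

In a real job the key would implement Hadoop's own interfaces instead of these stand-ins; the point here is only that one type carries both its serialization and its sort order.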

These interfaces are all necessary for MapReduce. The Comparable interface is used to compare keys when they are sorted for the reducer, and Writable is used to write results to local disk. Hadoop does not use Java's Serializable because Java serialization is too big and too heavy for Hadoop; Writable can serialize Hadoop objects in a very lightweight way.
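The size difference can be checked with the JDK alone. The sketch below is an illustration under stated assumptions, not a benchmark of Hadoop itself: it writes one int through `DataOutputStream.writeInt`, which is what a Writable-style `write` method would typically do, and writes the same value as a boxed `Integer` through `ObjectOutputStream`. The class and method names are hypothetical.

```java
import java.io.ByteArrayOutputStream;
import java.io.DataOutputStream;
import java.io.IOException;
import java.io.ObjectOutputStream;
import java.io.UncheckedIOException;

public class SerializationSizeSketch {

    // Bytes produced when only the raw field is written, Writable-style.
    static int rawSize(int value) {
        ByteArrayOutputStream bytes = new ByteArrayOutputStream();
        try (DataOutputStream out = new DataOutputStream(bytes)) {
            out.writeInt(value);
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
        return bytes.size();
    }

    // Bytes produced by standard Java serialization of a boxed Integer,
    // which also writes a stream header and class metadata.
    static int javaSerializedSize(int value) {
        ByteArrayOutputStream bytes = new ByteArrayOutputStream();
        try (ObjectOutputStream out = new ObjectOutputStream(bytes)) {
            out.writeObject(Integer.valueOf(value));
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
        return bytes.size();
    }

    public static void main(String[] args) {
        System.out.println("raw int:           " + rawSize(42) + " bytes");
        System.out.println("Serializable form: " + javaSerializedSize(42) + " bytes");
    }
}
```

The raw form is always 4 bytes, while the Java-serialized form is several times larger because it describes the class as well as the value. Multiplied across the many records a MapReduce job moves, that overhead is the cost the answer refers to.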
