Problem description
Why does Hadoop need to introduce these new classes? They just seem to complicate the interface.
Hadoop needs to handle objects in its own way. For example, Hadoop uses Text instead of Java's String. The Text class in Hadoop is similar to a Java String; however, Text implements interfaces such as Comparable, Writable, and WritableComparable.
These interfaces are all necessary for MapReduce: the Comparable interface is used to compare keys when they are sorted on their way to the reducer, and Writable lets Hadoop serialize keys and values so that results can be written to local disk. Hadoop does not use Java's Serializable because it is too big and too heavy for Hadoop's purposes; Writable serializes Hadoop objects in a much lighter way.
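To make the size difference concrete, here is a minimal sketch that uses only the JDK. SimpleWritable and SimpleIntWritable are hypothetical stand-ins written for this example, not Hadoop classes; they only mirror the write/readFields shape of Hadoop's Writable so that the idea can be run without a Hadoop dependency. The sketch writes the same integer once through the Writable-style method and once through standard Java serialization, then sorts a few keys with compareTo.

```java
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.DataInput;
import java.io.DataInputStream;
import java.io.DataOutput;
import java.io.DataOutputStream;
import java.io.IOException;
import java.io.ObjectOutputStream;
import java.io.Serializable;
import java.io.UncheckedIOException;
import java.util.Arrays;

public class WritableSketch {

    // Hypothetical stand-in that mirrors the two methods of Hadoop's Writable.
    interface SimpleWritable {
        void write(DataOutput out) throws IOException;
        void readFields(DataInput in) throws IOException;
    }

    // Hypothetical counterpart of IntWritable. It is also Serializable here
    // only so the two encodings of the same object can be compared.
    static class SimpleIntWritable
            implements SimpleWritable, Comparable<SimpleIntWritable>, Serializable {
        private static final long serialVersionUID = 1L;
        private int value;

        SimpleIntWritable() {}
        SimpleIntWritable(int value) { this.value = value; }
        int get() { return value; }

        // Only the raw field is written: no class name, no field metadata.
        @Override public void write(DataOutput out) throws IOException { out.writeInt(value); }

        // The same instance can be refilled, so objects can be reused.
        @Override public void readFields(DataInput in) throws IOException { value = in.readInt(); }

        @Override public int compareTo(SimpleIntWritable other) {
            return Integer.compare(value, other.value);
        }
    }

    // Serialize with the Writable-style write method.
    static byte[] writableBytes(SimpleWritable w) {
        ByteArrayOutputStream buffer = new ByteArrayOutputStream();
        try (DataOutputStream out = new DataOutputStream(buffer)) {
            w.write(out);
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
        return buffer.toByteArray();
    }

    // Serialize with standard Java serialization (stream header plus class descriptor).
    static byte[] javaSerializedBytes(Serializable s) {
        ByteArrayOutputStream buffer = new ByteArrayOutputStream();
        try (ObjectOutputStream out = new ObjectOutputStream(buffer)) {
            out.writeObject(s);
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
        return buffer.toByteArray();
    }

    // Rebuild a value from the compact bytes.
    static SimpleIntWritable readBack(byte[] bytes) {
        SimpleIntWritable result = new SimpleIntWritable();
        try (DataInputStream in = new DataInputStream(new ByteArrayInputStream(bytes))) {
            result.readFields(in);
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
        return result;
    }

    public static void main(String[] args) {
        SimpleIntWritable key = new SimpleIntWritable(42);
        System.out.println("Writable-style bytes:  " + writableBytes(key).length);
        System.out.println("Java-serialized bytes: " + javaSerializedBytes(key).length);

        // Keys sort through compareTo, which is what Comparable provides.
        SimpleIntWritable[] keys = {
            new SimpleIntWritable(3), new SimpleIntWritable(1), new SimpleIntWritable(2)
        };
        Arrays.sort(keys);
        System.out.println("First key after sorting: " + keys[0].get());
    }
}
```

The Writable-style encoding is exactly the four bytes produced by writeInt, while the Java-serialized form also carries a stream header and a class descriptor, so it is considerably larger for the same value. In a real job you would use Hadoop's own Text and IntWritable rather than these stand-ins.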