This article looks at the question "What are the connections and differences between Hadoop Writable and java.io.serialization?" and how to approach it. It should be a useful reference for anyone facing the same problem; let's work through it below.

Problem description

Objects in Hadoop are serialized by implementing the Writable interface. So what are the connections and differences between Hadoop Writable and java.io.serialization?

Solution

Underlying storage differences:

Java Serializable

Serializable does not assume that the class of the stored values is known; it tags each instance with its class, i.e. it writes metadata about the object, including the class name, field names and types, and its superclass. ObjectOutputStream and ObjectInputStream optimize this somewhat, so that 5-byte handles are written for instances of a class after the first one. But sequences of objects written with handles cannot then be accessed randomly, since they depend on stream state. This complicates things like sorting.
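For illustration, here is a minimal sketch of standard Java serialization; the Point class and the demo class are made up for this example:

import java.io.*;

// A plain serializable class: the JVM records its class name,
// field names/types and superclass info in the stream.
class Point implements Serializable {
    long x;
    long y;
    Point(long x, long y) { this.x = x; this.y = y; }
}

public class SerializableDemo {
    public static void main(String[] args) throws Exception {
        ByteArrayOutputStream bytes = new ByteArrayOutputStream();
        try (ObjectOutputStream out = new ObjectOutputStream(bytes)) {
            out.writeObject(new Point(1L, 2L)); // full class descriptor + field values
            out.writeObject(new Point(3L, 4L)); // class descriptor referenced by a handle
        }
        try (ObjectInputStream in = new ObjectInputStream(
                new ByteArrayInputStream(bytes.toByteArray()))) {
            Point p = (Point) in.readObject(); // type is discovered from the stream itself
            System.out.println(p.x + "," + p.y);
        }
    }
}

The first writeObject call emits the full class descriptor for Point; later writes of the same class refer back to it through handles, which is why such a stream has to be read sequentially rather than randomly.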



Hadoop Writable

When you define a Writable, you know the expected class. So Writables do not store their type in the serialized representation: while deserializing, you already know what to expect. For example, if the input key is a LongWritable, an empty LongWritable instance is asked to populate itself from the input data stream. Because no meta information (class name, fields and their types, superclasses) needs to be stored, this results in considerably more compact binary files, straightforward random access and higher performance.
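For comparison, here is a minimal sketch of the Writable pattern; the PointWritable class is made up for this example, and only the raw field values go into the stream:

import java.io.*;
import org.apache.hadoop.io.Writable;

// A custom Writable: only the two long values are written, no class metadata.
public class PointWritable implements Writable {
    private long x;
    private long y;

    public PointWritable() { }                      // no-arg constructor for reuse/instantiation
    public PointWritable(long x, long y) { this.x = x; this.y = y; }

    @Override
    public void write(DataOutput out) throws IOException {
        out.writeLong(x);                           // 8 bytes
        out.writeLong(y);                           // 8 bytes
    }

    @Override
    public void readFields(DataInput in) throws IOException {
        x = in.readLong();                          // caller already knows the expected type
        y = in.readLong();
    }
}

An empty instance is created first and then asked to fill itself from the stream, e.g. new PointWritable().readFields(in); only the 16 bytes of field data are read, since the stream carries no type information.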






Some good reads:

For Java Serializable:

Hadoop Writable








This concludes the article on "What are the connections and differences between Hadoop Writable and java.io.serialization?" We hope the answer above helps, and thank you for your continued support!
