How does Java's serialization work and when should it be used instead of some other persistence technique?

Question

I've lately been trying to learn more about Java's serialization and to test it for both work and personal projects, and I must say that the more I know about it, the less I like it. This may be caused by misinformation, though, which is why I'm asking you all these two things:

1: At the byte level, how does serialization know how to match serialized values with a particular class?

One of my problems here is that I made a small test with an ArrayList containing the values "one", "two", "three". After serialization the byte array took 78 bytes, which seems like an awful lot for such a small amount of information (19+3+3+4 bytes). Granted, there's bound to be some overhead, but this leads to my second question:

2: Can serialization be considered a good method for persisting objects at all? Obviously, if I used some homemade XML format, the persisted data would be something like this

<object>
    <class="java.util.ArrayList">
    <!-- Object array inside Arraylist is called elementData -->
    <field name="elementData">
        <value>One</value>
        <value>Two</value>
        <value>Three</value>
    </field>
</object>

which, like XML in general, is a bit bloated and takes 138 bytes (without whitespace, that is). The same in JSON could be

{
    "java.util.ArrayList": {
        "elementData": [
            "one",
            "two",
            "three"
        ]
    }
}

which is 75 bytes, so already slightly smaller than Java's serialization. With these text-based formats it's of course obvious that there has to be a way to represent your basic data as text, numbers or any combination of both.

So to recap: how does serialization work at the byte/bit level, when should it be used, when shouldn't it be used, and what are the real benefits of serialization besides the fact that it comes standard in Java?

Recommended answer

I would personally try to avoid Java's "built-in" serialization:

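  • It's not portable to other platforms
  • It's not hugely efficient
  • It's fragile - getting it to cope with multiple versions of a class is somewhat tricky. Even changing compilers can break serialization unless you're careful.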
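On the fragility point, one common precaution is to declare serialVersionUID explicitly rather than letting the runtime derive a default from the compiled class. The sketch below assumes a hypothetical Point class (the class, its fields and the helper names are illustrative, not from the question) and simply confirms that the declared ID is the one the serialization machinery reports, and that a round trip preserves the fields.

```java
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.ObjectInputStream;
import java.io.ObjectOutputStream;
import java.io.ObjectStreamClass;
import java.io.Serializable;

class VersionDemo {
    /** Hypothetical example class; the name and fields are illustrative only. */
    static final class Point implements Serializable {
        // Declared explicitly so the ID is not computed from the compiled class shape.
        private static final long serialVersionUID = 1L;
        final int x;
        final int y;

        Point(int x, int y) {
            this.x = x;
            this.y = y;
        }
    }

    /** Returns the serialVersionUID that the serialization runtime associates with a class. */
    static long streamId(Class<?> type) {
        return ObjectStreamClass.lookup(type).getSerialVersionUID();
    }

    /** Serializes a Point to memory and reads it back. */
    static Point roundTrip(Point point) {
        try {
            ByteArrayOutputStream buffer = new ByteArrayOutputStream();
            try (ObjectOutputStream out = new ObjectOutputStream(buffer)) {
                out.writeObject(point);
            }
            try (ObjectInputStream in =
                     new ObjectInputStream(new ByteArrayInputStream(buffer.toByteArray()))) {
                return (Point) in.readObject();
            }
        } catch (IOException | ClassNotFoundException e) {
            throw new IllegalStateException(e);
        }
    }

    public static void main(String[] args) {
        Point copy = roundTrip(new Point(3, 4));
        System.out.println("serialVersionUID: " + streamId(Point.class));
        System.out.println("round trip: " + copy.x + ", " + copy.y);
    }
}
```

Pinning the ID only keeps the stream identity stable. Whether old data still deserializes sensibly after fields are added or removed is a separate compatibility question that still has to be managed by hand.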

For details of what the actual bytes mean, see the Java Object Serialization Specification.
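As a rough illustration of where the overhead in the question comes from, the sketch below serializes the same three-element ArrayList and inspects the result. It relies on two things from the specification: the stream starts with a magic number (0xACED) and a version, and the class descriptor carries the fully qualified class name, which is how the reader knows which class to instantiate. The class and helper names (StreamPeek, readShort, containsClassName) are made up for this example, and the exact total size may vary between Java versions.

```java
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.ObjectOutputStream;
import java.io.UncheckedIOException;
import java.nio.charset.StandardCharsets;
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

class StreamPeek {
    /** Serializes any object to a byte array using the built-in mechanism. */
    static byte[] serialize(Object value) {
        ByteArrayOutputStream buffer = new ByteArrayOutputStream();
        try (ObjectOutputStream out = new ObjectOutputStream(buffer)) {
            out.writeObject(value);
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
        return buffer.toByteArray();
    }

    /** Reads the unsigned two-byte big-endian value at the given offset. */
    static int readShort(byte[] bytes, int offset) {
        return ((bytes[offset] & 0xFF) << 8) | (bytes[offset + 1] & 0xFF);
    }

    /** True if the raw stream contains the given class name as plain bytes. */
    static boolean containsClassName(byte[] bytes, String className) {
        // ISO-8859-1 maps every byte to one char, so ASCII names survive intact.
        String raw = new String(bytes, StandardCharsets.ISO_8859_1);
        return raw.contains(className);
    }

    public static void main(String[] args) {
        List<String> list = new ArrayList<>(Arrays.asList("one", "two", "three"));
        byte[] bytes = serialize(list);
        System.out.printf("total bytes: %d%n", bytes.length);
        System.out.printf("magic: 0x%04X, version: %d%n",
                readShort(bytes, 0), readShort(bytes, 2));
        System.out.println("class name embedded: "
                + containsClassName(bytes, "java.util.ArrayList"));
    }
}
```

The class name alone accounts for 19 bytes, and the descriptor also records the serialVersionUID and field information, so a small payload ends up dominated by metadata.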

There are various alternatives, such as:

  • XML and JSON, as you've shown (various XML flavours, of course)
  • YAML
  • Facebook's Thrift (RPC as well as serialization)
  • Google Protocol Buffers
  • Hessian (web services as well as serialization)
  • Apache Avro
  • Your own custom format

(Disclaimer: I work for Google, and I'm doing a port of Protocol Buffers to C# as my 20% project, so clearly I think that's a good bit of technology :)

Cross-platform formats are almost always more restrictive than platform-specific formats for obvious reasons - Protocol Buffers has a pretty limited set of native types, for example - but the interoperability can be incredibly useful. You also need to consider the impact of versioning, with backward and forward compatibility, etc. The text formats are generally hand-editable, but tend to be less efficient in both space and time.
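To make the space trade-off concrete, here is a minimal sketch of the "own custom format" option for the list from the question. The layout is a hypothetical one chosen for illustration (a 4-byte element count followed by each string written with DataOutputStream.writeUTF), not any standard format, and the CompactListCodec name is invented. For "one", "two", "three" it produces 21 bytes: 4 for the count, plus a 2-byte length prefix and the characters of each string.

```java
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.DataInputStream;
import java.io.DataOutputStream;
import java.io.IOException;
import java.io.UncheckedIOException;
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

class CompactListCodec {
    /** Writes the element count, then each string with a 2-byte length prefix. */
    static byte[] encode(List<String> values) {
        ByteArrayOutputStream buffer = new ByteArrayOutputStream();
        try (DataOutputStream out = new DataOutputStream(buffer)) {
            out.writeInt(values.size());
            for (String value : values) {
                out.writeUTF(value);
            }
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
        return buffer.toByteArray();
    }

    /** Reads back a list written by encode. */
    static List<String> decode(byte[] bytes) {
        try (DataInputStream in = new DataInputStream(new ByteArrayInputStream(bytes))) {
            int count = in.readInt();
            List<String> values = new ArrayList<>(count);
            for (int i = 0; i < count; i++) {
                values.add(in.readUTF());
            }
            return values;
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    public static void main(String[] args) {
        List<String> original = Arrays.asList("one", "two", "three");
        byte[] bytes = encode(original);
        System.out.println("encoded size: " + bytes.length + " bytes");
        System.out.println("decoded: " + decode(bytes));
    }
}
```

The saving comes from dropping everything the reader is assumed to know already: there is no class name, no type information and no version marker, so both sides must agree on the layout out of band, and any change to it has to be handled explicitly.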

Basically, you need to look at your requirements carefully.

08-24 03:13