What's the best way to ensure a Function argument is serializable?

Problem description

I'm writing a serializable class that takes several arguments, including a Function:

public class Cls implements Serializable {
    private final Collection<String> _coll;
    private final Function<String, ?> _func;

    public Cls(Collection<String> coll, Function<String, ?> func) {
        _coll = coll;
        _func = func;
    }
}

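func is stored in a member variable, so it needs to be serializable. A Java lambda is serializable if the type it is assigned to is serializable. What's the best way to ensure that the Function passed to my constructor is serializable, if it is created using a lambda?

  1. Create a SerializableFunction type and use that:

    public interface SerializableFunction<F, R> extends Function<F, R>, Serializable {}
    ....
    public Cls(Collection<String> coll, SerializableFunction<String, ?> func) {...}
    

    Issues:

    • There's now a mismatch between the coll and func arguments: func is declared as serializable in the signature but coll is not, yet both must be serializable for the class to work.
    • It doesn't allow other implementations of Function that are serializable.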
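  2. Use a type parameter on the constructor:

    public <F extends Function<String, ?> & Serializable>
    Cls(Collection<String> coll, F func) {...}
    

    Issues:

    • More flexible than 1, but more confusing.
    • There's still a mismatch between the two arguments: func is required to implement Serializable in the compile-time type hierarchy, but coll is only required to be serializable somehow (although this requirement can be cast away if needed).

    EDIT: This code doesn't actually compile when called with a lambda or method reference.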

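  3. Leave it up to the caller

    This requires the caller to know (from the javadocs, or by trial and error) that the argument needs to be serializable, and to cast as appropriate:

    Cls c = new Cls(strList, (Function<String, ?> & Serializable)s -> ...);
    

    or

    Cls c = new Cls(strList, (Function<String, ?> & Serializable)Foo::processStr);
    

    This is ugly IMO, and the initial naive implementation using a lambda is guaranteed to break, rather than being likely to work as with coll (since most collections are serializable somehow). It also pushes an implementation detail of the class onto the caller.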
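Options 1 and 3 above can be compared in a minimal sketch. The class name SerializableLambdaDemo and its helper methods are hypothetical names introduced only for illustration; option 2 is left out because, as noted above, it does not compile when called with a lambda. The sketch serializes each function to a byte array, reads it back, and applies the copy.

```java
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.NotSerializableException;
import java.io.ObjectInputStream;
import java.io.ObjectOutputStream;
import java.io.Serializable;
import java.util.function.Function;

// Hypothetical demo class, not part of the original question.
class SerializableLambdaDemo {
    // Option 1: a functional interface that is both a Function and Serializable.
    interface SerializableFunction<T, R> extends Function<T, R>, Serializable {}

    // A plain lambda: its target type is not Serializable.
    static Function<String, Integer> plain() {
        return s -> s.length();
    }

    // Option 1: the target type is Serializable, so the lambda is too.
    static Function<String, Integer> viaInterface() {
        SerializableFunction<String, Integer> f = s -> s.length();
        return f;
    }

    // Option 3: the caller adds Serializable with an intersection cast.
    static Function<String, Integer> viaCast() {
        return (Function<String, Integer> & Serializable) s -> s.length();
    }

    // Serializes f, reads it back, and applies the copy.
    // Returns null if f turns out not to be serializable.
    static Integer applyAfterRoundTrip(Function<String, Integer> f, String input) {
        try {
            ByteArrayOutputStream bytes = new ByteArrayOutputStream();
            try (ObjectOutputStream out = new ObjectOutputStream(bytes)) {
                out.writeObject(f);
            }
            try (ObjectInputStream in = new ObjectInputStream(
                    new ByteArrayInputStream(bytes.toByteArray()))) {
                @SuppressWarnings("unchecked")
                Function<String, Integer> copy = (Function<String, Integer>) in.readObject();
                return copy.apply(input);
            }
        } catch (NotSerializableException e) {
            return null;
        } catch (IOException | ClassNotFoundException e) {
            throw new IllegalStateException(e);
        }
    }

    public static void main(String[] args) {
        System.out.println(applyAfterRoundTrip(plain(), "storm"));        // null
        System.out.println(applyAfterRoundTrip(viaInterface(), "storm")); // 5
        System.out.println(applyAfterRoundTrip(viaCast(), "storm"));      // 5
    }
}
```

In this sketch the failure for the plain lambda only shows up at runtime, when writeObject is called, which is the "guaranteed to break" case described under option 3.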

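At the moment I'm leaning towards option 2, as the one that imposes the least burden on the caller, but I don't think there's an ideal solution here. Any other suggestions for how to do this properly?

EDIT: Perhaps some background is required. This class runs inside Storm, in a bolt, which is serialized and transferred to a remote cluster for execution. When run on the cluster, the function performs an operation on the processed tuples. So being serializable, and having a serializable function argument, is very much part of the class's purpose. If it is not serializable, the class is not usable at all.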

Solution

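In most cases the answer is: don't.

You may notice that most classes of the JRE, even ObjectOutputStream.writeObject, do not enforce Serializable in their signatures. There are simply too many APIs not specific to Serialization in which the compile-time information that an object implements Serializable gets lost; using them together with Serialization would require lots of type casts if the latter forced its inputs to be Serializable.

Since one of your parameters is a Collection, you may take examples from that API:

Collections.unmodifiableList:

    The returned list will be serializable if the specified list is serializable.

You will find more such operations that take care to retain the Serialization capability without retaining the Serializable compile-time type on the result.

This also applies to all non-public types, e.g. the results of Collections.emptyList(), Arrays.asList(…) and Comparator.reverseOrder(). They are all Serializable without declaring it.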
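A small sketch of that point, assuming a standard OpenJDK-based runtime: none of the static types below mention Serializable, yet a runtime instanceof check succeeds for each result. The class name HiddenSerializableDemo and the helper isSerializable are hypothetical names used only for this illustration.

```java
import java.io.Serializable;
import java.util.ArrayList;
import java.util.Arrays;
import java.util.Collections;
import java.util.Comparator;
import java.util.List;

// Hypothetical demo class, not part of the original answer.
class HiddenSerializableDemo {
    // Runtime check: the static type of the argument says nothing about Serializable.
    static boolean isSerializable(Object o) {
        return o instanceof Serializable;
    }

    public static void main(String[] args) {
        List<String> empty = Collections.emptyList();
        List<String> fixed = Arrays.asList("a", "b");
        Comparator<String> reversed = Comparator.reverseOrder();
        List<String> view = Collections.unmodifiableList(new ArrayList<>(fixed));

        // Each declared type is List or Comparator, yet every check prints true.
        System.out.println(isSerializable(empty));
        System.out.println(isSerializable(fixed));
        System.out.println(isSerializable(reversed));
        System.out.println(isSerializable(view));
    }
}
```

Had the constructor in the question required a Serializable compile-time type for coll, each of these values would have needed a cast before it could be passed in.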


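Further, every class that has more use cases than just being serialized should refrain from requiring Serializable everywhere. That would hinder the uses where no Serialization is involved.

Regarding the Collection parameter, you may consider removing the serializable constraint altogether. Normally, you protect your class against later changes to the collection you received. A simple solution is to copy the collection, and while you're doing so, you may use a type that supports Serialization.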
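A minimal sketch of the copying approach, assuming the elements are only needed as a list. CopyingCls, items() and canSerialize are hypothetical names introduced for illustration; the function argument is still left to the caller, as the answer recommends.

```java
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.NotSerializableException;
import java.io.ObjectOutputStream;
import java.io.Serializable;
import java.io.UncheckedIOException;
import java.util.ArrayList;
import java.util.Collection;
import java.util.Collections;
import java.util.HashMap;
import java.util.List;
import java.util.Map;
import java.util.function.Function;

// Hypothetical variant of the question's Cls that copies its collection argument.
class CopyingCls implements Serializable {
    private static final long serialVersionUID = 1L;

    private final ArrayList<String> items;   // always a serializable type
    private final Function<String, ?> func;  // serializability is the caller's responsibility

    CopyingCls(Collection<String> coll, Function<String, ?> func) {
        this.items = new ArrayList<>(coll);   // defensive copy
        this.func = func;
    }

    List<String> items() {
        return Collections.unmodifiableList(items);
    }

    // Demo helper: true if o can be written to an ObjectOutputStream.
    static boolean canSerialize(Object o) {
        try (ObjectOutputStream out = new ObjectOutputStream(new ByteArrayOutputStream())) {
            out.writeObject(o);
            return true;
        } catch (NotSerializableException e) {
            return false;
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    public static void main(String[] args) {
        Map<String, Integer> counts = new HashMap<>();
        counts.put("a", 1);
        Collection<String> keys = counts.keySet(); // HashMap's key view is not Serializable
        Function<String, Integer> len = (Function<String, Integer> & Serializable) String::length;

        CopyingCls c = new CopyingCls(keys, len);
        counts.put("b", 2);                         // a later change is not seen by c
        System.out.println(c.items());              // [a]
        System.out.println(canSerialize(keys));     // false
        System.out.println(canSerialize(c));        // true
    }
}
```

The copy costs one pass over the collection at construction time, but it means the class accepts any Collection and is also protected from later modifications by the caller.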

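Even if you want to avoid copying, Serialization is itself a copying process, so you can simply write custom readObject and writeObject methods that store the contents of the Collection, eliminating the need for a Serializable collection.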
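The custom readObject/writeObject approach could look roughly like the sketch below. ContentsCls, items() and roundTrip are hypothetical names; the sketch assumes the collection field may be non-final and transient, and that restoring the elements into an ArrayList is acceptable.

```java
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.ObjectInputStream;
import java.io.ObjectOutputStream;
import java.io.Serializable;
import java.util.ArrayList;
import java.util.Arrays;
import java.util.Collection;
import java.util.HashMap;
import java.util.Map;
import java.util.function.Function;

// Hypothetical variant of the question's Cls that serializes only the collection's contents.
class ContentsCls implements Serializable {
    private static final long serialVersionUID = 1L;

    // transient: default serialization skips this field, so its type need not be Serializable.
    private transient Collection<String> coll;
    private final Function<String, ?> func;

    ContentsCls(Collection<String> coll, Function<String, ?> func) {
        this.coll = coll;   // no copy at construction time
        this.func = func;
    }

    Collection<String> items() {
        return coll;
    }

    private void writeObject(ObjectOutputStream out) throws IOException {
        out.defaultWriteObject();                       // writes func
        out.writeObject(coll.toArray(new String[0]));   // writes only the elements
    }

    private void readObject(ObjectInputStream in) throws IOException, ClassNotFoundException {
        in.defaultReadObject();                         // restores func
        String[] contents = (String[]) in.readObject();
        coll = new ArrayList<>(Arrays.asList(contents));
    }

    // Demo helper: serializes the instance to bytes and reads it back.
    static ContentsCls roundTrip(ContentsCls original) {
        try {
            ByteArrayOutputStream bytes = new ByteArrayOutputStream();
            try (ObjectOutputStream out = new ObjectOutputStream(bytes)) {
                out.writeObject(original);
            }
            try (ObjectInputStream in = new ObjectInputStream(
                    new ByteArrayInputStream(bytes.toByteArray()))) {
                return (ContentsCls) in.readObject();
            }
        } catch (IOException | ClassNotFoundException e) {
            throw new IllegalStateException(e);
        }
    }

    public static void main(String[] args) {
        Map<String, Integer> counts = new HashMap<>();
        counts.put("a", 1);
        Function<String, Integer> len = (Function<String, Integer> & Serializable) String::length;
        ContentsCls copy = roundTrip(new ContentsCls(counts.keySet(), len));
        System.out.println(copy.items()); // [a]
    }
}
```

The trade-off in this sketch is that the field can no longer be final, and the deserialized instance holds an ArrayList regardless of the original collection type.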


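To summarize, the usual policy is that if the user of your class intends to serialize instances of it, it is the user's responsibility to make sure that all components put into it are themselves Serializable.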


