How do I use Memcache in Dataflow?

Question

I would like to use Memcache in a Dataflow ParDo. Any ideas how? I can't use the existing Memcache libraries because they belong to App Engine and are not serializable.

Rohit

Recommended answer

My guess is that your DoFn has a private field of type MemcacheServiceImpl. (If my guess is wrong, please edit your question to include the code of your DoFn.)

Dataflow serializes your DoFns when you submit the pipeline and deserializes them on the workers, so every non-transient field of a DoFn must itself be serializable. The proper way to handle this is to make the field transient and initialize it lazily:

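import com.google.appengine.api.memcache.MemcacheService;
import com.google.appengine.api.memcache.MemcacheServiceFactory;

class MyDoFn extends DoFn<..., ...> {
  // transient: skipped during serialization, so it is null when the DoFn arrives on a worker.
  private transient MemcacheService memcache;

  // Creates the client on first use, then reuses the same instance.
  private MemcacheService getMemcache() {
    if (memcache == null) {
      memcache = MemcacheServiceFactory.getMemcacheService();
      ...
    }
    return memcache;
  }

  public void process(...) {
    ...use getMemcache()...
  }
}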
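To see why the transient field has to be initialized lazily, here is a minimal, self-contained sketch that uses only the Java standard library. It does not use Dataflow or App Engine at all: FakeCacheClient, CachingFn, and roundTrip are hypothetical names made up for this illustration, and the serialization round trip only approximates what happens when a DoFn is shipped to a worker.

```java
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.ObjectInputStream;
import java.io.ObjectOutputStream;
import java.io.Serializable;
import java.util.HashMap;
import java.util.Map;

public class TransientLazyInitDemo {

    /** Hypothetical stand-in for a client that is NOT Serializable (like a Memcache client). */
    static class FakeCacheClient {
        private final Map<String, String> store = new HashMap<>();
        void put(String key, String value) { store.put(key, value); }
        String get(String key) { return store.get(key); }
    }

    /** Hypothetical stand-in for a DoFn: Serializable, with a transient, lazily created client. */
    static class CachingFn implements Serializable {
        private static final long serialVersionUID = 1L;

        // transient: not written during serialization, so it is null after deserialization.
        private transient FakeCacheClient client;

        // Creates the client on first use and reuses it afterwards.
        private FakeCacheClient getClient() {
            if (client == null) {
                client = new FakeCacheClient();
            }
            return client;
        }

        boolean isClientInitialized() { return client != null; }

        // Returns the upper-cased element, consulting the cache first.
        String process(String element) {
            FakeCacheClient c = getClient();
            String cached = c.get(element);
            if (cached == null) {
                cached = element.toUpperCase();
                c.put(element, cached);
            }
            return cached;
        }
    }

    /** Serializes and deserializes an object in memory, imitating shipping it to a worker. */
    static <T extends Serializable> T roundTrip(T obj) throws IOException, ClassNotFoundException {
        ByteArrayOutputStream bytes = new ByteArrayOutputStream();
        try (ObjectOutputStream out = new ObjectOutputStream(bytes)) {
            out.writeObject(obj);
        }
        try (ObjectInputStream in =
                 new ObjectInputStream(new ByteArrayInputStream(bytes.toByteArray()))) {
            @SuppressWarnings("unchecked")
            T copy = (T) in.readObject();
            return copy;
        }
    }

    public static void main(String[] args) throws Exception {
        CachingFn fn = new CachingFn();
        fn.process("a"); // the client now exists on the "submitting" side

        CachingFn onWorker = roundTrip(fn); // succeeds because the client field is transient
        System.out.println(onWorker.isClientInitialized()); // false
        System.out.println(onWorker.process("hello"));      // HELLO
        System.out.println(onWorker.isClientInitialized()); // true
    }
}
```

If the client field were not transient, the writeObject call would fail with a NotSerializableException, which is the same kind of failure the question describes.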

Also note that to access App Engine APIs, including Memcache, from a non-App Engine environment, you should use the Remote API.


07-21 01:17