Problem Description
Imagine having a function that handles a heavy computational job, which we wish to execute asynchronously in a Tornado application context. Moreover, we would like to lazily evaluate the function by storing its results to disk, so that the function is not rerun twice for the same arguments.
Without caching the result (memoization), one would do the following:
def complex_computation(arguments):
    ...
    return result
@gen.coroutine
def complex_computation_caller(arguments):
    ...
    result = complex_computation(arguments)
    raise gen.Return(result)
Assume that, to achieve function memoization, we choose the Memory class from joblib. By simply decorating the function with @mem.cache, the function can easily be memoized:
@mem.cache
def complex_computation(arguments):
    ...
    return result
where mem can be something like mem = Memory(cachedir=get_cache_dir()).
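For reference, here is a minimal self-contained sketch of the joblib setup (the fixed cache path and the doubling body are made-up placeholders; the question's get_cache_dir() helper would supply the real directory, and the cachedir parameter name assumes an older joblib, as used in the question):

from joblib import Memory

# hypothetical stand-in for the question's get_cache_dir() helper
mem = Memory(cachedir="/tmp/joblib_cache", verbose=0)

@mem.cache
def complex_computation(arguments):
    # placeholder for the actual heavy computation
    return arguments * 2

print(complex_computation(21))  # computed and written to disk
print(complex_computation(21))  # loaded from the on-disk cache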
Now consider combining the two, where we execute the computationally complex function on an executor:
from concurrent import futures

from joblib import Memory
from tornado import gen
from tornado.concurrent import run_on_executor
from tornado.ioloop import IOLoop


class TaskRunner(object):
    def __init__(self, loop=None, number_of_workers=1):
        self.executor = futures.ThreadPoolExecutor(number_of_workers)
        self.loop = loop or IOLoop.instance()

    @run_on_executor
    def run(self, func, *args, **kwargs):
        return func(*args, **kwargs)


mem = Memory(cachedir=get_cache_dir())
_runner = TaskRunner(number_of_workers=1)  # the first positional argument is loop


@mem.cache
def complex_computation(arguments):
    ...
    return result


@gen.coroutine
def complex_computation_caller(arguments):
    result = yield _runner.run(complex_computation, arguments)
    ...
    raise gen.Return(result)
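For completeness, a hypothetical way to drive this coroutine from synchronous code (run_sync is standard Tornado; the argument value 42 is made up):

from tornado.ioloop import IOLoop

if __name__ == "__main__":
    # block until the coroutine completes on the IOLoop
    result = IOLoop.current().run_sync(
        lambda: complex_computation_caller(42))
    print(result)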
So the first question is whether the aforementioned approach is technically correct.
Now let's consider the following scenario:
@gen.coroutine
def first_coroutine(arguments):
    ...
    result = yield second_coroutine(arguments)
    raise gen.Return(result)


@gen.coroutine
def second_coroutine(arguments):
    ...
    result = yield third_coroutine(arguments)
    raise gen.Return(result)
The second question is how one can memoize second_coroutine. Is it correct to do something like:
@gen.coroutine
def first_coroutine(arguments):
    ...
    mem = Memory(cachedir=get_cache_dir())
    mem_second_coroutine = mem(second_coroutine)
    result = yield mem_second_coroutine(arguments)
    raise gen.Return(result)


@gen.coroutine
def second_coroutine(arguments):
    ...
    result = yield third_coroutine(arguments)
    raise gen.Return(result)
[UPDATE I] Caching and reusing a function result in Tornado discusses using functools.lru_cache or repoze.lru.lru_cache as a solution to the second question.
Recommended Answer
The Future objects returned by Tornado coroutines are reusable, so it generally works to use in-memory caches such as functools.lru_cache, as explained in this question. Just be sure to put the caching decorator before @gen.coroutine.
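A minimal sketch of that decorator ordering, assuming Python 3 (where plain return works inside coroutines) and that caching the returned Future per argument tuple is acceptable:

import functools

from tornado import gen

@functools.lru_cache(maxsize=128)  # outermost: caches the Future for each arguments value
@gen.coroutine
def second_coroutine(arguments):
    result = yield third_coroutine(arguments)
    return result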
On-disk caching (which seems to be implied by the cachedir argument to Memory) is trickier, since Future objects cannot generally be written to disk. Your TaskRunner example should work, but it's doing something fundamentally different from the others because complex_calculation is not a coroutine. Your last example will not work, because it's trying to put the Future object in the cache.
Instead, if you want to cache things with a decorator, you'll need a decorator that wraps the inner coroutine with a second coroutine. Something like this:
def cached_coroutine(f):
    cache = {}

    @gen.coroutine
    def wrapped(*args):
        if args in cache:
            return cache[args]
        result = yield f(*args)
        cache[args] = result  # cache the resolved result, keyed by the call arguments
        return result

    return wrapped
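Applied to the question's scenario, usage would look something like this (hypothetical; second_coroutine and third_coroutine as defined earlier):

@cached_coroutine
@gen.coroutine
def second_coroutine(arguments):
    result = yield third_coroutine(arguments)
    return result  # on Python 2, use: raise gen.Return(result)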