Combining Tornado gen.coroutine and joblib mem.cache decorators



Imagine a function that performs a heavy computational job and that we wish to execute asynchronously in a Tornado application. Moreover, we would like to evaluate the function lazily: store its results on disk and never rerun it for arguments it has already seen.


Without caching the result (memoization) one would do the following:

from tornado import gen

def complex_computation(arguments):
    # ... heavy computation that produces `result` ...
    return result

@gen.coroutine
def complex_computation_caller(arguments):
    result = complex_computation(arguments)
    raise gen.Return(result)

Assume that, to achieve function memoization, we choose the Memory class from joblib. Simply decorating the function with @mem.cache memoizes it:

@mem.cache
def complex_computation(arguments):
    # ... heavy computation that produces `result` ...
    return result

where mem could be defined as, for example, mem = Memory(cachedir=get_cache_dir()).


Now consider combining the two, where we execute the computationally complex function on an executor:

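from concurrent import futures

from joblib import Memory
from tornado import gen
from tornado.concurrent import run_on_executor
from tornado.ioloop import IOLoop

class TaskRunner(object):
    def __init__(self, loop=None, number_of_workers=1):
        self.executor = futures.ThreadPoolExecutor(number_of_workers)
        self.loop = loop or IOLoop.instance()

    @run_on_executor  # submits the call to self.executor and returns a Future
    def run(self, func, *args, **kwargs):
        return func(*args, **kwargs)

mem = Memory(cachedir=get_cache_dir())
_runner = TaskRunner(1)

@mem.cache
def complex_computation(arguments):
    # ... heavy computation that produces `result` ...
    return result

@gen.coroutine
def complex_computation_caller(arguments):
    result = yield _runner.run(complex_computation, arguments)
    raise gen.Return(result)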


So the first question is whether the approach above is technically correct.


Now let's consider the following scenario:

@gen.coroutine
def first_coroutine(arguments):
    result = yield second_coroutine(arguments)
    raise gen.Return(result)

@gen.coroutine
def second_coroutine(arguments):
    result = yield third_coroutine(arguments)
    raise gen.Return(result)

The second question is how to memoize second_coroutine. Is it correct to do something like this:

@gen.coroutine
def first_coroutine(arguments):
    mem = Memory(cachedir=get_cache_dir())
    mem_second_coroutine = mem.cache(second_coroutine)
    result = yield mem_second_coroutine(arguments)
    raise gen.Return(result)

@gen.coroutine
def second_coroutine(arguments):
    result = yield third_coroutine(arguments)
    raise gen.Return(result)

[UPDATE I] The question "Caching and reusing a function result in Tornado" discusses using functools.lru_cache or repoze.lru.lru_cache as a solution to the second question.


The Future objects returned by Tornado coroutines are reusable, so in-memory caches such as functools.lru_cache generally work, as explained in this question. Just be sure to put the caching decorator before (that is, above) @gen.coroutine, so that the cache wraps the coroutine and stores the Future it returns.
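The idea of caching the Future itself can be sketched as follows. This is only an illustration under stated assumptions: it uses the standard library's asyncio as a stand-in for Tornado so that it runs without extra packages (in Tornado 5 and later, Tornado Futures are asyncio Futures), and the names slow_square, _slow_square and calls are hypothetical. With Tornado, stacking @functools.lru_cache above @gen.coroutine plays the role of the cached wrapper shown here.

```python
import asyncio
import functools

calls = []  # records every real execution of the slow work


async def _slow_square(x):
    """Hypothetical slow coroutine; the sleep stands in for real work."""
    calls.append(x)
    await asyncio.sleep(0.01)
    return x * x


@functools.lru_cache(maxsize=None)
def slow_square(x):
    # The cache stores the Future (an asyncio Task here), not the value.
    # A finished Future can be awaited again and yields the same result.
    return asyncio.ensure_future(_slow_square(x))


async def main():
    first = await slow_square(3)
    second = await slow_square(3)  # served from the cached Future
    return first, second


result = asyncio.run(main())
```

Note that a cached Future belongs to the event loop that created it, so a cache like this should live no longer than that loop.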

On-disk caching (which the cachedir argument to Memory seems to imply) is trickier, since Future objects generally cannot be written to disk. Your TaskRunner example should work, but it is doing something fundamentally different from the others because complex_computation is not a coroutine: it is a plain function, and its return value is what gets cached. Your last example will not work, because it tries to put the Future object in the cache.
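To make the distinction concrete, here is a minimal sketch of the pattern that does work: a plain function is memoized on disk and then run on an executor, so only its ordinary return value is ever written out. It is a sketch under assumptions, not the joblib implementation: disk_cached is a hypothetical, simplified stand-in for mem.cache built on pickle, CACHE_DIR is a temporary directory chosen for the example, and asyncio's run_in_executor is used in place of Tornado's run_on_executor.

```python
import asyncio
import hashlib
import os
import pickle
import tempfile
from concurrent.futures import ThreadPoolExecutor

CACHE_DIR = tempfile.mkdtemp()  # hypothetical cache location
calls = []  # records every real execution


def disk_cached(func):
    """Simplified stand-in for joblib's mem.cache: pickle results by args."""
    def wrapper(*args):
        key = hashlib.sha256(pickle.dumps((func.__name__, args))).hexdigest()
        path = os.path.join(CACHE_DIR, key + ".pkl")
        if os.path.exists(path):
            with open(path, "rb") as fh:
                return pickle.load(fh)
        value = func(*args)
        with open(path, "wb") as fh:
            pickle.dump(value, fh)
        return value
    return wrapper


@disk_cached
def complex_computation(n):
    # Plain, blocking function: its return value is picklable.
    calls.append(n)
    return sum(i * i for i in range(n))


async def caller(executor, n):
    loop = asyncio.get_running_loop()
    # The Future created here stays in memory; it is never cached.
    return await loop.run_in_executor(executor, complex_computation, n)


async def main():
    with ThreadPoolExecutor(max_workers=1) as executor:
        first = await caller(executor, 1000)
        second = await caller(executor, 1000)  # read back from disk
    return first, second


result = asyncio.run(main())
```

The second call never reaches the computation, because the wrapper finds the pickled value from the first call.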


Instead, if you want to cache things with a decorator, you'll need a decorator that wraps the inner coroutine with a second coroutine, so that the resolved result, not the Future, is what gets stored. Something like this:

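from tornado import gen

cache = {}  # in-memory here; an on-disk store could be used instead

def cached_coroutine(f):
    @gen.coroutine
    def wrapped(*args):
        if args in cache:
            return cache[args]  # on Python 2, use raise gen.Return(...)
        result = yield f(*args)
        cache[args] = result  # store the resolved value, not the Future
        return result
    return wrapped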
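
For comparison, the same wrapping idea can be written with native coroutines. The following is a sketch, assuming Python 3.7+ and the standard library's asyncio rather than Tornado's gen.coroutine; slow_double and calls are hypothetical names used only to show the behaviour.

```python
import asyncio
import functools


def cached_coroutine(f):
    cache = {}  # one cache per decorated coroutine function

    @functools.wraps(f)
    async def wrapped(*args):
        if args in cache:
            return cache[args]
        result = await f(*args)
        cache[args] = result  # store the resolved value, not the awaitable
        return result
    return wrapped


calls = []  # records every real execution


@cached_coroutine
async def slow_double(x):
    calls.append(x)
    await asyncio.sleep(0.01)
    return 2 * x


async def main():
    return [await slow_double(4), await slow_double(4), await slow_double(5)]


results = asyncio.run(main())
```

Because the value is stored only after the inner coroutine finishes, two concurrent calls with the same arguments would both run the computation; caching the Future instead avoids that, at the cost of not being storable on disk.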
