Problem Description
I have a pretty standard Django + RabbitMQ + Celery setup with 1 Celery task and 5 workers.
The task uploads the same (I'm simplifying a bit) big file (~100 MB) asynchronously to a number of remote PCs.
All of this works fine, at the expense of using lots of memory, since every task/worker loads that big file into memory separately.
What I would like to do is have some kind of cache, accessible to all tasks, i.e. load the file only once. Django caching based on locmem would be perfect, but as the documentation says, "each process will have its own private cache instance", and I need this cache to be accessible to all workers.
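For illustration, here is a minimal sketch of that cache idea, assuming Django's default cache is backed by a store shared across processes (e.g. Redis via django-redis) rather than locmem; the file path, cache key, and task name are hypothetical. Note that each worker still holds the bytes in its own memory while it uploads; the shared backend only keeps a single authoritative copy and spares the repeated disk reads.

```python
# Minimal sketch, assuming a cross-process cache backend (e.g. Redis via
# django-redis) is configured as Django's default cache; with locmem this
# would not help, since every process gets its own private instance.
from celery import shared_task
from django.core.cache import cache

FILE_PATH = "/path/to/big_file.bin"   # hypothetical path
CACHE_KEY = "big-file-bytes"          # hypothetical key

def get_big_file():
    """Return the file contents, hitting the disk only on a cache miss."""
    data = cache.get(CACHE_KEY)
    if data is None:
        with open(FILE_PATH, "rb") as f:
            data = f.read()
        cache.set(CACHE_KEY, data, timeout=3600)  # keep it for an hour
    return data

@shared_task
def upload_big_file(host):
    data = get_big_file()
    # ... push `data` to `host` over whatever protocol the upload uses ...
```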
I tried to play with Celery signals, as described in #2129820, but that's not what I need.
So the question is: is there a way to define something global in Celery (like a dict-based class where I could load the file or something)? Or is there a Django trick I could use in this situation?
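One way to read "something global" is a module-level cache that each worker process fills lazily. A minimal sketch, assuming that loading the file once per worker process (rather than once per task) is good enough; the path and task name are hypothetical:

```python
# Minimal sketch of a module-level "global dict". Celery worker processes are
# long-lived, so this dict survives across task invocations: the file is read
# at most once per worker process (5 workers -> at most 5 copies in memory),
# not once per task execution.
from celery import shared_task

_file_cache = {}  # lives for the lifetime of the worker process

def get_big_file(path="/path/to/big_file.bin"):
    if path not in _file_cache:
        with open(path, "rb") as f:
            _file_cache[path] = f.read()
    return _file_cache[path]

@shared_task
def upload_big_file(host):
    data = get_big_file()
    # ... upload `data` to `host` ...
```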
Thanks.
Why not simply stream the upload(s) from disk instead of loading the whole file into memory?
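A minimal sketch of that suggestion, assuming a plain TCP socket as the transport (swap in whatever protocol the real upload uses); only one chunk is ever held in memory at a time:

```python
import socket

CHUNK_SIZE = 1024 * 1024  # read 1 MB at a time

def stream_upload(path, host, port):
    """Send the file in fixed-size chunks so the whole ~100 MB never sits in memory."""
    with socket.create_connection((host, port)) as sock, open(path, "rb") as f:
        while True:
            chunk = f.read(CHUNK_SIZE)
            if not chunk:
                break
            sock.sendall(chunk)
```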