Multiple Docker containers and Celery

Question

We have the following structure of the project right now:


  1. A web-server that processes incoming requests from the clients.

  2. An analytics module that provides some recommendations to the users.

We decided to keep these modules completely independent and move them to different docker containers. When a query from a user arrives at the web-server, it sends another query to the analytics module to get the recommendations.

For recommendations to be consistent we need to do some background calculations periodically and when, for instance, new users register within our system. Also, some background tasks are connected purely with the web-server logic. For this purpose we decided to use a distributed task queue, e.g., Celery.

There are the following possible scenarios of task creation and execution:


  1. A task queued at the web-server and executed at the web-server (e.g. processing an uploaded image)

  2. A task queued at the web-server and executed at the analytics module (e.g. calculating recommendations for a new user)

  3. A task queued at the analytics module and executed there (e.g. a periodic update)

So far I see 3 rather weird possibilities to use Celery here:

I. Celery in a separate container, doing everything


  1. Move Celery to the separate docker container.
  2. Provide all of the necessary packages from both web-server and analytics to execute tasks.
  3. Share tasks code with other containers (or declare dummy tasks at web-server and analytics)

This way, we lose isolation, as the functionality is shared by the Celery container and the other containers.

II. Celery in a separate container, doing less

Same as I, but the tasks are now just requests to the web-server and the analytics module, which are handled asynchronously there, with the result polled inside the task until it is ready.

This way, we get the benefits of having the broker, but all heavy computations are moved out of the Celery workers.

III. A separate Celery in each container


  1. Run Celery in both the web-server and the analytics module.

  2. Add dummy task declarations (for the analytics tasks) to the web-server.

  3. Add two task queues, one for the web-server and one for analytics.

This way, tasks scheduled at the web-server could be executed in the analytics module. However, we still have to share the tasks code across the containers or use dummy tasks, and additionally we need to run celery workers in each container.

What is the best way to do this, or should the logic be changed completely, e.g., by moving everything inside one container?

Answer

First, let's clarify the difference between the celery library (which you get with pip install or in your setup.py) and the celery worker - the actual process that dequeues tasks from the broker and handles them. Of course you might want to have multiple workers/processes (for example, to separate different tasks to different workers).

Let's say you have two tasks: calculate_recommendations_task and periodic_update_task, and you want to run them on separate workers, i.e. recommendation_worker and periodic_worker. Another process will be celery beat, which just enqueues the periodic_update_task into the broker every x hours.
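To make that concrete, a minimal sketch of what the task code living in the analytics image might look like is shown below; the module path recommendation/celeryapp.py matches the -A recommendation.celeryapp:app option used in the compose file further down, while the broker URL, the function bodies and the beat interval are only placeholders:

# recommendation/celeryapp.py - a sketch; only the task names are taken from
# the discussion above, everything else here is a placeholder.
from datetime import timedelta
from celery import Celery

app = Celery('recommendation', broker='amqp://guest:guest@rabbit:5672/vhost')
app.config_from_object('config.celeryconfig')

@app.task(name='calculate_recommendations_task')
def calculate_recommendations_task(*args):
    pass  # heavy recommendation computation goes here

@app.task(name='periodic_update_task')
def periodic_update_task():
    pass  # periodic refresh of the recommendation data

# celery beat reads this schedule and enqueues periodic_update_task into the
# broker every x hours (the interval below is just an example value).
app.conf.CELERYBEAT_SCHEDULE = {
    'periodic-update': {
        'task': 'periodic_update_task',
        'schedule': timedelta(hours=1),
    },
}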

I'll assume you want to use the celery broker & backend with docker too, and I'll pick the recommended usage of celery - RabbitMQ as the broker and Redis as the backend.

So now we have 6 containers, I'll write them in a docker-compose.yml:

version: '2'
services:
  rabbit:
    image: rabbitmq:3-management
    ports:
      - "15672:15672"
      - "5672:5672"
    environment:
      - RABBITMQ_DEFAULT_VHOST=vhost
      - RABBITMQ_DEFAULT_USER=guest
      - RABBITMQ_DEFAULT_PASS=guest
  redis:
    image: library/redis
    command: redis-server /usr/local/etc/redis/redis.conf
    expose:
      - "6379"
    ports:
      - "6379:6379"
  recommendation_worker:
    image: recommendation_image
    command: celery worker -A recommendation.celeryapp:app -l info -Q recommendation_worker -c 1 -n recommendation_worker@%h -Ofair
  periodic_worker:
    image: recommendation_image
    command: celery worker -A recommendation.celeryapp:app -l info -Q periodic_worker -c 1 -n periodic_worker@%h -Ofair
  beat:
    image: recommendation_image
    command: <not sure>
  web:
    image: web_image
    command: python web_server.py
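The beat service's command is left as <not sure> above. Assuming the periodic schedule is attached to the same recommendation.celeryapp:app application, a typical way to start it would be something like celery beat -A recommendation.celeryapp:app -l info; the exact invocation depends on where the schedule and its state file are kept.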

Both Dockerfiles, which build the recommendation_image and the web_image, should install the celery library. Only the recommendation_image should have the tasks code, because its workers are going to handle those tasks:

RecommendationDockerfile:

FROM python:2.7-wheezy
RUN pip install celery
COPY tasks_src_code .

WebDockerfile:

FROM python:2.7-wheezy
RUN pip install celery
RUN pip install bottle
COPY web_src_code .

The other images (rabbitmq:3-management & library/redis) are available from Docker Hub and will be pulled automatically when you run docker-compose up.

Now here is the thing: in your web server you can trigger celery tasks by their string name and pull the results by task id (without sharing the code). web_server.py:

import bottle
from bottle import request
from celery import Celery
from celery.result import AsyncResult

app = bottle.Bottle()

# Celery client pointing at the same broker the workers use; a result
# backend (the Redis container) also needs to be configured - here via
# config.celeryconfig - so that results can be fetched back.
rabbit_path = 'amqp://guest:guest@rabbit:5672/vhost'
celeryapp = Celery('recommendation', broker=rabbit_path)
celeryapp.config_from_object('config.celeryconfig')

@app.route('/trigger_task', method='POST')
def trigger_task():
    # enqueue by string name, so this image never needs the task code
    r = celeryapp.send_task('calculate_recommendations_task', args=(1, 2, 3))
    return r.id

@app.route('/trigger_task_res', method='GET')
def trigger_task_res():
    task_id = request.query['task_id']
    result = AsyncResult(task_id, app=celeryapp)
    if result.ready():
        return result.get()
    return result.state

app.run(host='0.0.0.0', port=8080)  # port choice is arbitrary
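From the client's side the flow is then: POST to /trigger_task, keep the returned task id, and poll /trigger_task_res until the result is ready. A small illustration using the requests library (the base URL is an assumption that matches the app.run() call above; how the web container is actually exposed is up to you):

import time
import requests

base = 'http://localhost:8080'  # hypothetical address of the web container

# enqueue the task and remember its id
task_id = requests.post(base + '/trigger_task').text

# poll until the worker has produced a result (or failed)
while True:
    resp = requests.get(base + '/trigger_task_res', params={'task_id': task_id}).text
    if resp not in ('PENDING', 'STARTED', 'RETRY'):
        break
    time.sleep(1)

print(resp)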

And the last file, config.celeryconfig.py:

CELERY_ROUTES = {
    'calculate_recommendations_task': {
        'exchange': 'recommendation_worker',
        'exchange_type': 'direct',
        'routing_key': 'recommendation_worker'
    }
}
CELERY_ACCEPT_CONTENT = ['pickle', 'json', 'msgpack', 'yaml']
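Two caveats, both additions on top of the original config rather than part of it: the route above only covers calculate_recommendations_task, so a matching entry is presumably needed for periodic_update_task to end up on the periodic_worker queue, and a result backend must be configured somewhere for the AsyncResult polling in web_server.py to return anything. Roughly:

# route the periodic task to the queue consumed by periodic_worker
CELERY_ROUTES['periodic_update_task'] = {
    'exchange': 'periodic_worker',
    'exchange_type': 'direct',
    'routing_key': 'periodic_worker'
}
# result backend (the redis service from docker-compose.yml), so the
# web-server can fetch task results
CELERY_RESULT_BACKEND = 'redis://redis:6379/0'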
