How to run long tasks on Google App Engine with gunicorn?


Problem description

GAE flex uses gunicorn as the entrypoint by default, which is fine, except I have a function that takes a very long time to process (scraping websites and storing data in a db). gunicorn times out at 30 seconds by default, then a new worker starts the task all over again, and so on.

I can set the gunicorn timeout to something like 20 minutes, but it doesn't seem graceful. Is there any way to run these backend functions "outside" of gunicorn, or perhaps a gunicorn config I'm not thinking about? There is no client side, so the long time to complete isn't an issue.

My app.yaml file currently looks like this:

runtime: python
env: flex
entrypoint: gunicorn -b :$PORT main:app

runtime_config:
  python_version: 2

# This sample incurs costs to run on the App Engine flexible environment.
# The settings below are to reduce costs during testing and are not appropriate
# for production use. For more information, see:
# https://cloud.google.com/appengine/docs/flexible/python/configuring-your-app-with-app-yaml
manual_scaling:
  instances: 1
resources:
  cpu: 1
  memory_gb: 3
  disk_size_gb: 10
Solution

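You can use an async worker class, and then you won't need to set the timeout to 20 minutes. The default worker class is sync, where the timeout effectively caps how long a single request may run. For non-sync workers, the gunicorn docs say the timeout only checks that the worker process is still communicating and is not tied to the length of one request. See the gunicorn docs on worker types for details.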

Use the eventlet async worker (gevent is not recommended if you use the Google client libraries):

pip install eventlet

Then, when starting gunicorn, set the worker class to eventlet and the number of workers to [number of cores] x 2 + 1 (that's just a recommendation in the Google docs). For example:

CMD exec gunicorn --worker-class eventlet --workers 3 -b :$PORT main:app

Gunicorn Worker Configuration
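
The same settings can also live in a gunicorn config file, which is plain Python. This is only a sketch: the file name gunicorn.conf.py and the 8080 fallback port are assumptions, not something from the question, and the worker count simply applies the rule of thumb quoted above.

```python
# gunicorn.conf.py (hypothetical file name)
# Load it with: gunicorn -c gunicorn.conf.py main:app
import multiprocessing
import os

# Bind to the port App Engine passes in via $PORT.
# 8080 is an assumed fallback for local runs.
bind = ":{}".format(os.environ.get("PORT", "8080"))

# Async worker class from the answer; needs `pip install eventlet`.
worker_class = "eventlet"

# Rule of thumb from the answer: (number of cores) x 2 + 1.
workers = multiprocessing.cpu_count() * 2 + 1
```

With this file in place, the entrypoint in app.yaml would only need the -c flag instead of repeating --worker-class and --workers on the command line.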

Alternatively, use Pub/Sub and workers: the request handler only publishes a message, and a separate worker process picks it up and runs the long task outside of gunicorn.
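
To show the shape of that hand-off without any Google libraries, here is a rough sketch in which an in-process queue stands in for the Pub/Sub topic. This is not the Pub/Sub API: the names long_task, handle_request and worker are made up for illustration, and on Python 2 the module is called Queue rather than queue.

```python
import queue
import threading

# In-process queue standing in for a Pub/Sub topic (assumption for the sketch).
tasks = queue.Queue()
results = []


def long_task(url):
    """Hypothetical placeholder for the slow scraping/storing job."""
    return "processed " + url


def worker():
    """Pull tasks until a None sentinel arrives, like a subscriber would."""
    while True:
        url = tasks.get()
        if url is None:
            tasks.task_done()
            break
        results.append(long_task(url))
        tasks.task_done()


def handle_request(url):
    """What the web handler does: enqueue the job and return right away."""
    tasks.put(url)
    return "accepted"


thread = threading.Thread(target=worker)
thread.daemon = True
thread.start()

status = handle_request("https://example.com")
tasks.join()     # demo only: wait until the worker has drained the queue
tasks.put(None)  # tell the worker to stop
thread.join()
```

The point of the design is that handle_request returns well inside the 30 second window no matter how long long_task takes, so the gunicorn timeout never comes into play.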
