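I am trying to tune my web service (WS) to support ~20k concurrent users.

No matter which configuration I change, I still get the same 6-second average response time per endpoint, along with various 502/504 errors, once my test reaches about 2k users.

Web service stack:

CloudFlare -> Nginx -> Gunicorn -> Django/DRF -> Memcache -> Postgres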

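Here is what I have tried:

  • Increasing the Gunicorn workers from 4 to 10
  • Increasing the service (pod) instances from 3 to 10
  • Increasing the Gunicorn worker timeout to 120
  • Increasing the Nginx proxy_pass timeout to 120

Most endpoints hit the database once every 100 seconds; all other requests are served from memcache.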
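As a rough sanity check on the numbers above, here is a back-of-the-envelope estimate based on Little's law. It is only a sketch under assumptions the post does not confirm: that the worker count is per pod, that Gunicorn runs its default synchronous worker class (one request per worker at a time), and that the 0.05 s service time is a purely hypothetical figure.

```python
def max_throughput(pods: int, workers_per_pod: int, service_time_s: float) -> float:
    """Upper bound on requests/second when each worker serves one request at a time."""
    in_flight_slots = pods * workers_per_pod
    return in_flight_slots / service_time_s


def mean_latency(outstanding_requests: int, throughput_rps: float) -> float:
    """Little's law: L = lambda * W, so W = L / lambda."""
    return outstanding_requests / throughput_rps


if __name__ == "__main__":
    # 10 pods x 10 workers, as tried in the post; 0.05 s per request is hypothetical.
    rps = max_throughput(pods=10, workers_per_pod=10, service_time_s=0.05)
    print(f"throughput ceiling: {rps:.0f} req/s")
    # Average time in the system if 2,000 requests are outstanding at once.
    print(f"mean latency with 2000 outstanding: {mean_latency(2000, rps):.1f} s")
```

Plugging in a measured service time and the real pod and worker counts shows whether the observed latency is explained by queueing alone, before touching any timeouts.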

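Can anyone help by pointing out which configuration I should change?

Where should I look for the latency/bottleneck?

The Gunicorn workers are clearly timing out, which I don't understand, because there is no logic in the WS view. It should just fetch the query result from memcache and return it.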
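One way to check whether the time is really spent inside the view is to time each call at the WSGI layer. The sketch below is a generic WSGI wrapper, not something from the original post; the class name `TimingMiddleware`, the logger name, and the 1-second threshold are all hypothetical.

```python
import logging
import time

logger = logging.getLogger("request_timing")


class TimingMiddleware:
    """Wrap a WSGI app and log how long each call into the app takes."""

    def __init__(self, app, slow_threshold_s=1.0):
        self.app = app
        self.slow_threshold_s = slow_threshold_s
        # Kept only so the sketch is easy to inspect; not safe to rely on under threads.
        self.last_elapsed_s = None

    def __call__(self, environ, start_response):
        started = time.perf_counter()
        try:
            # Measures the time until the app returns its response iterable.
            # Time spent queued in Nginx or in Gunicorn's listen backlog is NOT included.
            return self.app(environ, start_response)
        finally:
            elapsed = time.perf_counter() - started
            self.last_elapsed_s = elapsed
            level = logging.WARNING if elapsed >= self.slow_threshold_s else logging.DEBUG
            logger.log(
                level,
                "%s %s took %.3f s",
                environ.get("REQUEST_METHOD"),
                environ.get("PATH_INFO"),
                elapsed,
            )
```

In a Django project this could wrap the object returned by `django.core.wsgi.get_wsgi_application()` in `wsgi.py`. If the logged times stay small while clients still see multi-second responses, the delay is likely occurring before the request reaches the view.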

Nginx logs:
    latforms/android HTTP/1.1", upstream: "http://10.0.1.17:9090/endpoints/platforms/android", host: "myhost.co"
    2018/08/13 23:43:25 [error] 8893#8893: *2809163 upstream timed out (110: Connection timed out) while connecting to upstream, client: 200.211.198.133, server: myhost.co, request: "GET /endpoints/store/products/729 HTTP/1.1", upstream: "http://10.0.1.18:9090/endpoints/store/products/729", host: "myhost.co"
    200.211.198.133 - [200.211.198.133] - - [13/Aug/2018:23:43:25 +0000] "GET /endpoints/store/categories/?cat_pk=13081 HTTP/1.1" 200 1718 "-" "python-requests/2.18.4" 627 80.840 [production-service-api-80] 10.0.0.112:9090, 10.0.1.13:9090, 10.0.0.113:9090 0, 0, 11150 40.000, 40.000, 0.840 504, 504, 200
    200.211.198.133 - [200.211.198.133] - - [13/Aug/2018:23:43:25 +0000] "GET /endpoints/store/categories/?cat_pk=13081 HTTP/1.1" 200 1718 "-" "python-requests/2.18.4" 689 80.857 [production-service-api-80] 10.0.0.112:9090, 10.0.1.12:9090, 10.0.0.113:9090 0, 0, 11150 40.000, 40.000, 0.857 504, 504, 200
    200.211.198.133 - [200.211.198.133] - - [13/Aug/2018:23:43:25 +0000] "GET /endpoints/store/home/ HTTP/1.1" 200 10072 "-" "python-requests/2.18.4" 670 80.580 [production-service-api-80] 10.0.1.13:9090, 10.0.1.11:9090, 10.0.0.112:9090 0, 0, 66511 40.001, 40.002, 0.577 504, 504, 200
    200.211.198.133 - [200.211.198.133] - - [13/Aug/2018:23:43:25 +0000] "GET /endpoints/store/products/691/ HTTP/1.1" 200 703 "-" "python-requests/2.18.4" 646 80.486 [production-service-api-80] 10.0.1.8:9090, 10.0.1.13:9090, 10.0.1.12:9090 0, 0, 1968 40.000, 40.000, 0.486 504, 504, 200
    200.211.198.133 - [200.211.198.133] - - [13/Aug/2018:23:43:25 +0000] "GET /endpoints/store/products/5458 HTTP/1.1" 301 0 "-" "python-requests/2.18.4" 678 80.444 [production-service-api-80] 10.0.1.13:9090, 10.0.1.12:9090, 10.0.1.17:9090 0, 0, 0 40.000, 40.002, 0.442 504, 504, 301
    ....
    90, 10.0.1.11:9090, 10.0.1.8:9090 0, 0, 1968 40.000, 40.000, 0.584 504, 504, 200
    200.211.198.133 - [200.211.198.133] - - [13/Aug/2018:23:43:25 +0000] "GET /endpoints/store/products/5458/ HTTP/1.1" 200 241 "-" "python-requests/2.18.4" 647 80.709 [production-service-api-80] 10.0.0.113:9090, 10.0.1.8:9090, 10.0.0.112:9090 0, 0, 327 40.001, 40.000, 0.708 504, 504, 200
    --
    2018/08/13 23:43:25 [error] 8766#8766: *2809243 upstream timed out (110: Connection timed out) while connecting to upstream, client: 200.211.198.133, server: myhost.co, request: "GET /endpoints/store/categories/?cat_pk=13081 HTTP/1.1", upstream: "http://10.0.1.13:9090/endpoints/store/categories/?cat_pk=13081", host: "myhost.co"
    200.211.198.133 - [200.211.198.133] - - [13/Aug/2018:23:43:25 +0000] "GET /endpoints/store/products/692 HTTP/1.1" 301 0 "-" "python-requests/2.18.4" 677 80.672 [production-service-api-80] 10.0.1.17:9090, 10.0.1.10:9090, 10.0.0.113:9090 0, 0, 0 40.001, 40.001, 0.670 504, 504, 301
    200.211.198.133 - [200.211.198.133] - - [13/Aug/2018:23:43:25 +0000] "GET /endpoints/store/products/4608/ HTTP/1.1" 200 553 "-" "python-requests/2.18.4" 647 80.591 [production-service-api-80] 10.0.1.11:9090, 10.0.1.17:9090, 10.0.1.8:9090 0, 0, 1090 40.000, 40.003, 0.588 504, 504, 200
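
The access log lines above can be broken down per upstream attempt. The sketch below assumes the trailing fields are, in order, request length, total request time, upstream name, then comma-separated lists of upstream addresses, response lengths, response times and statuses; that layout is inferred from the sample lines, not confirmed by the post. The function name `upstream_attempts` is hypothetical.

```python
import re

# Matches: "<user agent>" <request_length> <request_time> [<upstream name>] <rest>
TAIL = re.compile(r'" (\d+) ([\d.]+) \[([^\]]+)\] (.+)$')

SAMPLE = (
    '200.211.198.133 - [200.211.198.133] - - [13/Aug/2018:23:43:25 +0000] '
    '"GET /endpoints/store/categories/?cat_pk=13081 HTTP/1.1" 200 1718 "-" '
    '"python-requests/2.18.4" 627 80.840 [production-service-api-80] '
    '10.0.0.112:9090, 10.0.1.13:9090, 10.0.0.113:9090 0, 0, 11150 '
    '40.000, 40.000, 0.840 504, 504, 200'
)


def upstream_attempts(line):
    """Return (total_request_time, [attempt, ...]) or None if the line does not match.

    Assumes every attempt reports a numeric time and status, as in the sample lines.
    """
    match = TAIL.search(line.strip())
    if match is None:
        return None
    request_time = float(match.group(2))
    # Join each comma-separated list into one token, then split the four lists apart.
    groups = match.group(4).replace(", ", ",").split()
    if len(groups) != 4:
        return None
    addrs, _lengths, times, statuses = (group.split(",") for group in groups)
    attempts = [
        {"addr": addr, "seconds": float(secs), "status": int(status)}
        for addr, secs, status in zip(addrs, times, statuses)
    ]
    return request_time, attempts


if __name__ == "__main__":
    total, attempts = upstream_attempts(SAMPLE)
    print(f"total request time: {total} s")
    for attempt in attempts:
        print(attempt)
```

Read this way, the sample request spent about 40 s on each of two attempts that ended in 504 and under a second on the third attempt that returned 200. Together with the "while connecting to upstream" errors, this suggests the time is lost waiting to connect to a backend rather than inside the view.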
    

Gunicorn logs:
    {"asctime": "2018-08-13 23:42:55,145", "name": "gunicorn.access", "levelname": "INFO", "message": "10.0.0.13 - - [13/Aug/2018:23:42:55 +0000] \"GET /endpoints/store/products/691/ HTTP/1.1\" 200 1968 \"-\" \"python-requests/2.18.4\""}
    {"asctime": "2018-08-13 23:42:55,167", "name": "gunicorn.access", "levelname": "INFO", "message": "10.0.0.13 - - [13/Aug/2018:23:42:55 +0000] \"GET /endpoints/store/products/729 HTTP/1.1\" 301 - \"-\" \"python-requests/2.18.4\""}
    [2018-08-13 23:42:55 +0000] [1] [CRITICAL] WORKER TIMEOUT (pid:36)
    [2018-08-13 23:42:55 +0000] [1] [CRITICAL] WORKER TIMEOUT (pid:37)
    [2018-08-13 23:42:55 +0000] [382] [INFO] Booting worker with pid: 382
    [2018-08-13 23:42:55 +0000] [383] [INFO] Booting worker with pid: 383
    {"asctime": "2018-08-13 23:42:55,403", "name": "gunicorn.access", "levelname": "INFO", "message": "10.0.0.13 - - [13/Aug/2018:23:42:55 +0000] \"GET /endpoints/store/products/691/ HTTP/1.1\" 200 1968 \"-\" \"python-requests/2.18.4\""}
    ....
    {"asctime": "2018-08-13 23:42:55,184", "name": "gunicorn.access", "levelname": "INFO", "message": "10.0.0.13 - - [13/Aug/2018:23:42:55 +0000] \"GET /endpoints/store/categories/?cat_pk=13081 HTTP/1.1\" 200 11150 \"-\" \"python-requests/2.18.4\""}
    {"asctime": "2018-08-13 23:42:55,262", "name": "gunicorn.access", "levelname": "INFO", "message": "10.0.0.13 - - [13/Aug/2018:23:42:55 +0000] \"GET /endpoints/platforms/android HTTP/1.1\" 200 48 \"-\" \"python-requests/2.18.4\""}
    {"asctime": "2018-08-13 23:42:55,439", "name": "gunicorn.access", "levelname": "INFO", "message": "10.0.0.13 - - [13/Aug/2018:23:42:55 +0000] \"GET /endpoints/platforms/android HTTP/1.1\" 200 48 \"-\" \"python-requests/2.18.4\""}
    --
    [2018-08-13 23:42:56 +0000] [1] [CRITICAL] WORKER TIMEOUT (pid:31)
    {"asctime": "2018-08-13 23:42:56,689", "name": "gunicorn.access", "levelname": "INFO", "message": "10.0.0.13 - - [13/Aug/2018:23:42:56 +0000] \"GET /endpoints/store/products/729/ HTTP/1.1\" 200 2163 \"-\" \"python-requests/2.18.4\""}
    {"asctime": "2018-08-13 23:42:56,799", "name": "gunicorn.access", "levelname": "INFO", "message": "10.0.0.13 - - [13/Aug/2018:23:42:56 +0000] \"GET /endpoints/store/products/5458/ HTTP/1.1\" 200 327 \"-\" \"python-requests/2.18.4\""}
    

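Best answer

Why aren't you using uWSGI?

To get better performance, do the following:

  • Reduce the number of database hits in your code
  • Increase the number of Gunicorn workers
  • Disable info-level logging in Gunicorn and Nginx

If these changes do not work for you, you will have to change your setup or increase your server resources.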
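The Gunicorn side of these suggestions could look roughly like the following `gunicorn.conf.py`. This is a sketch, not the poster's actual configuration: the worker formula is the general starting point from the Gunicorn documentation, the port is taken from the upstream addresses in the logs, and the right values depend on the CPU and memory available to each pod.

```python
# gunicorn.conf.py: an illustrative sketch, not the poster's real settings.
import multiprocessing

# Port 9090 appears in the upstream addresses in the Nginx logs above.
bind = "0.0.0.0:9090"

# Gunicorn's documentation suggests (2 x cores) + 1 as a starting worker count.
workers = multiprocessing.cpu_count() * 2 + 1

# The worker timeout the question reports trying.
timeout = 120

# Only log warnings and above, and leave the access log disabled.
loglevel = "warning"
accesslog = None
```

Gunicorn reads a file like this when started with `-c gunicorn.conf.py`. Raising `workers` beyond what the pod's CPU can serve only adds context switching, so the value is worth validating under load.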

Regarding "django - Nginx + Gunicorn + Django high latency", we found a similar question on Stack Overflow: https://stackoverflow.com/questions/51832478/
