Django and parallel processing

Problem description

Versions:

  • Python 3.5.1
  • Django 1.10
  • mysqlclient 1.3.10
  • mysql 5.7.18-0ubuntu0.16.04.1(Ubuntu)
  • Linux Mint 18.1

I have a large Django project with a setup script that loads a lot of content into the database from some CSV files. Once in a while, I need to reset everything and re-add all the content from these files. The data also requires some post-processing once it has been added. This takes a while, because the files are long and the code contains some unavoidable nested loops as well as many database queries.

In many cases the tasks are independent, so it should be possible to run them in parallel. I looked around for parallel processing libraries and decided to use Python's very simple multiprocessing module.

The setup is therefore quite simple. We define a function to run in parallel and then call Pool. Simplified code:

from multiprocessing import Pool

def some_func(input):
    # code inserting data into Django here
    pass

# run some_func on four inputs using four worker processes
with Pool(4) as p:
    p.map(some_func, [1, 2, 3, 4])

However, running the code results in database connection errors like the following, which have also been reported in several other places:

_mysql_exceptions.OperationalError: (2013, 'Lost connection to MySQL server during query')

It seems like the different worker processes are trying to share one connection, or maybe the connection is not passed on to the workers.
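The first guess can be illustrated without Django. The sketch below is only a demonstration under assumptions: `FakeConnection` is a hypothetical stand-in for a database connection, not a Django or mysqlclient class, and the "fork" start method is selected explicitly, so it will not run on platforms without fork. It shows that forked workers start out holding the object the parent opened rather than one of their own.

```python
import multiprocessing as mp
import os


class FakeConnection:
    """Hypothetical stand-in for a DB connection; remembers which process opened it."""

    def __init__(self):
        self.opened_by = os.getpid()


# Opened once in the parent process, before any workers exist.
conn = FakeConnection()


def uses_inherited_connection(_):
    # True when this worker holds a connection opened by a different process.
    return conn.opened_by != os.getpid()


def inherited_flags(n_tasks=4):
    # Assumption: "fork" start method, so workers get a copy of the parent's state.
    ctx = mp.get_context("fork")
    with ctx.Pool(2) as pool:
        return pool.map(uses_inherited_connection, range(n_tasks))
```

With a real connection, the copied object wraps the same underlying socket as the parent's, which is one plausible way for several processes to end up talking over a single connection.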

How do I get parallel processing to work with Django database actions?

Recommended answer

After googling around, I found an old (2009) related question on the Django Google group:

So, my Process.start() calls a function which starts with:

from django.db import connection

connection.close()

This solved my problem.

Thus, to solve the issue, change the function so that it closes the inherited connection first, something like this:

def some_func(input):
    # close the database connection inherited from the parent process;
    # Django opens a new one in this worker on the next query
    from django.db import connection
    connection.close()

    # code inserting data into Django here
    pass

After that, it works fine.
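A variation on the same fix is to close the inherited connections once per worker instead of once per call, using the `initializer` argument of `Pool`. This is only a sketch, not part of the original answer: the helper names `close_db_connections` and `run_in_pool` are made up for illustration, it assumes the installed Django version provides `django.db.connections.close_all()`, and it assumes the "fork" start method so that workers inherit the parent's already configured Django settings.

```python
import multiprocessing as mp


def close_db_connections():
    """Drop every database connection this worker inherited from the parent."""
    # Imported lazily, as in the answer above, so the import happens in the worker.
    # Assumption: the installed Django version provides connections.close_all().
    from django.db import connections
    connections.close_all()


def run_in_pool(func, items, processes=4, initializer=close_db_connections):
    """Map func over items in worker processes that each start with no open connection."""
    # Assumption: "fork" start method (not available on every platform).
    ctx = mp.get_context("fork")
    with ctx.Pool(processes, initializer=initializer) as pool:
        return pool.map(func, items)
```

With this in place, the call from the question would become something like `run_in_pool(some_func, [1, 2, 3, 4])`, and `some_func` itself no longer needs to call `connection.close()`.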

