Problem description
Versions:
- Python 3.5.1
- Django 1.10
- mysqlclient 1.3.10
- mysql 5.7.18-0ubuntu0.16.04.1(Ubuntu)
- Linux Mint 18.1
I have a large Django project with a setup script that loads a bunch of content into the database from some CSV files. Once in a while, I need to reset everything and re-add everything from these files. The data also requires some post-processing once added. This takes a while, because the files are long and the code contains some unavoidable nested loops as well as many database queries.
In many cases the tasks are independent, so it should be possible to run them in parallel. I looked around for parallel processing libraries and decided to use the very simple multiprocessing.
Thus, the setup is quite simple. We define some function to run in parallel, and then call Pool. Simplified code:
from multiprocessing import Pool

def some_func(input):
    # code inserting data into Django here
    pass

with Pool(4) as p:
    p.map(some_func, [1, 2, 3, 4])
However, running the code results in database connection errors like the following, which others have reported as well:
_mysql_exceptions.OperationalError: (2013, 'Lost connection to MySQL server during query')
It seems like the different worker processes are trying to share one connection, or maybe the connection is not passed on to the workers.
How do I get parallel processing to work with Django database actions?
Recommended answer
After googling around, I found an old (2009) related question on the Django Google group:
So, my Process.start() calls a function which starts with:
from django.db import connection
connection.close()
This solved my problem.
Thus, to solve the issue, change the function to be something like this:
def some_func(input):
    # close the old database connection; Django opens a new one on the next query
    from django.db import connection
    connection.close()
    # code inserting data into Django here
    pass
Then it works fine.
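As a rough illustration of why closing the connection helps, here is a minimal sketch that runs without Django or MySQL. It assumes a Unix-like system where worker processes are created with fork (as on the Linux setup in the question). StandInConnection, close_inherited_connection, check_owner and run are hypothetical names made up for this sketch; the stand-in only mimics the one behaviour that matters here, namely that a closed connection is reopened lazily by whichever process uses it next. It also does the closing once per worker through the Pool initializer argument instead of at the top of every task, which is a variation on the answer rather than what the answer itself does.

```python
import multiprocessing
import os


class StandInConnection:
    """Hypothetical stand-in for django.db.connection (not Django's real class).

    It only records which process opened it, so we can see who owns it.
    """

    def __init__(self):
        self.owner_pid = os.getpid()
        self.is_open = True

    def close(self):
        self.is_open = False

    def ensure_open(self):
        # Reopen lazily, in whatever process is asking for the connection.
        if not self.is_open:
            self.owner_pid = os.getpid()
            self.is_open = True


# Opened in the parent process, before any workers exist.
connection = StandInConnection()


def close_inherited_connection():
    """Pool initializer: runs once in each worker, right after it starts."""
    connection.close()


def check_owner(_item):
    """Return True if this worker uses a connection it opened itself."""
    connection.ensure_open()
    return connection.owner_pid == os.getpid()


def run(close_first):
    # Assumption: the "fork" start method is available (Linux, most Unix systems).
    ctx = multiprocessing.get_context("fork")
    initializer = close_inherited_connection if close_first else None
    with ctx.Pool(4, initializer=initializer) as pool:
        return pool.map(check_owner, [1, 2, 3, 4])


if __name__ == "__main__":
    print("closing first:   ", run(close_first=True))
    print("without closing: ", run(close_first=False))
```

Without the initializer, every worker keeps using the copy of the connection that the parent opened, which matches the shared-connection situation suspected in the question. In a real Django project the initializer body would be the two lines from the answer (from django.db import connection, then connection.close()).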