Update multiple objects at once in Django?


Problem description

I am using Django 1.9. I have a Django table that represents the value of a particular measure, by organisation by month, with raw values and percentiles:
class MeasureValue(models.Model):
    org = models.ForeignKey(Org, null=True, blank=True)
    month = models.DateField()
    calc_value = models.FloatField(null=True, blank=True)
    percentile = models.FloatField(null=True, blank=True)
There are typically 10,000 or so per month. My question is about whether I can speed up the process of setting values on the models. 
Currently, I calculate percentiles by retrieving all the measurevalues for a month using a Django filter query, converting it to a pandas dataframe, and then using scipy's rankdata to set ranks and percentiles. I do this because pandas and rankdata are efficient, able to ignore null values, and able to handle repeated values in the way that I want, so I'm happy with this method:
records = MeasureValue.objects.filter(month=month).values()
df = pd.DataFrame.from_records(records)
# use calc_value to set percentile on each row, using scipy's rankdata
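The percentile step is elided in the question, so the following is only a rough, dependency-free sketch of what it might compute. It is a stand-in for scipy's rankdata with its default "average" tie handling, not the asker's actual code. The function names are hypothetical, and defining the percentile as 100 * rank / n over the non-null values is an assumption.

```python
import math


def average_ranks(values):
    """Rank non-null values from 1..n; tied values share their average rank.

    None and NaN entries are skipped and get a rank of None.
    """
    present = [
        (v, i) for i, v in enumerate(values)
        if v is not None and not math.isnan(v)
    ]
    present.sort(key=lambda pair: pair[0])
    ranks = [None] * len(values)
    pos = 0
    while pos < len(present):
        # Find the end of the run of equal values starting at pos.
        end = pos
        while end + 1 < len(present) and present[end + 1][0] == present[pos][0]:
            end += 1
        avg = (pos + 1 + end + 1) / 2.0  # ranks are 1-based
        for _, idx in present[pos:end + 1]:
            ranks[idx] = avg
        pos = end + 1
    return ranks


def percentiles(values):
    """Assumed definition: 100 * rank / (number of non-null values)."""
    ranks = average_ranks(values)
    n = sum(r is not None for r in ranks)
    return [None if r is None else 100.0 * r / n for r in ranks]
```

Null inputs come back as None, which is the form the percentile field needs for a NULL in the database.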
However, I then need to retrieve each percentile value from the dataframe, and set it back onto the model instances. Right now I do this by iterating over the dataframe's rows, and updating each instance:
for i, row in df.iterrows():
    mv = MeasureValue.objects.get(org_id=row.org_id, month=month)
    if (row.percentile is None) or np.isnan(row.percentile):
        row.percentile = None
    mv.percentile = row.percentile
    mv.save()
This is unsurprisingly quite slow. Is there any efficient Django way to speed it up, by making a single database write rather than tens of thousands? I have checked the documentation, but can't see one. 
Solution
Atomic transactions can reduce the time spent in the loop:
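from django.db import transaction
import numpy as np

# MeasureValue, df and month are defined as in the question.
with transaction.atomic():
    for i, row in df.iterrows():
        # .values() exposes the foreign key as org_id, so look up by that column.
        mv = MeasureValue.objects.get(org_id=row.org_id, month=month)

        # Store missing percentiles as NULL rather than NaN.
        percentile = row.percentile
        if percentile is None or np.isnan(percentile):
            percentile = None

        mv.percentile = percentile
        mv.save()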
Django's default behavior is to run in autocommit mode. Each query is immediately committed to the database, unless a transaction is active.
Wrapping the loop in with transaction.atomic() groups all the writes into a single transaction. The cost of committing is then paid once and amortized over all the enclosed statements, so the time per statement is greatly reduced. Note that the loop still issues one SELECT and one UPDATE per row; the transaction reduces commit overhead, not the number of queries.
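The difference between per-statement commits and one shared transaction can be sketched outside Django with the standard library's sqlite3 module. This is only an illustration under assumptions: the table and function names are made up, and no timings are measured, since an in-memory database would not show the disk-sync cost that makes individual commits expensive.

```python
import sqlite3


def make_db(n_rows):
    """Create an in-memory table standing in for the MeasureValue table."""
    # isolation_level=None puts the sqlite3 module in autocommit mode.
    conn = sqlite3.connect(":memory:", isolation_level=None)
    conn.execute(
        "CREATE TABLE measure_value (id INTEGER PRIMARY KEY, percentile REAL)"
    )
    conn.executemany(
        "INSERT INTO measure_value (id) VALUES (?)",
        [(i,) for i in range(1, n_rows + 1)],
    )
    return conn


def update_autocommit(conn, new_values):
    """Each UPDATE is its own transaction and is committed immediately."""
    for pk, percentile in new_values:
        conn.execute(
            "UPDATE measure_value SET percentile = ? WHERE id = ?",
            (percentile, pk),
        )


def update_in_one_transaction(conn, new_values):
    """All UPDATEs share one transaction, so there is a single commit."""
    conn.execute("BEGIN")
    try:
        for pk, percentile in new_values:
            conn.execute(
                "UPDATE measure_value SET percentile = ? WHERE id = ?",
                (percentile, pk),
            )
    except Exception:
        # Any failure undoes every UPDATE made since BEGIN.
        conn.execute("ROLLBACK")
        raise
    conn.execute("COMMIT")
```

The second function mirrors what transaction.atomic() does around the loop: either every row is updated and committed once, or an error rolls the whole batch back.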
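For readers on newer Django versions, the per-row get() and save() calls can be avoided altogether. QuerySet.bulk_update() was added in Django 2.2, so it is not available in the Django 1.9 used in the question. The sketch below assumes a model with month, org_id and percentile attributes, as in the question. The helper names and the percentile_by_org_id dict are hypothetical, and the model class is passed in as a parameter so that the helper itself does not import Django.

```python
import math


def nan_to_none(value):
    """Return None for None/NaN so the database stores NULL, else the value."""
    if value is None:
        return None
    if isinstance(value, float) and math.isnan(value):
        return None
    return value


def write_percentiles(model, month, percentile_by_org_id, batch_size=1000):
    """Set percentile on every row for `month` using batched UPDATEs.

    `model` is expected to be a Django model class such as MeasureValue;
    `percentile_by_org_id` is a hypothetical dict built from the dataframe.
    Rows whose org_id is missing from the dict get a percentile of None.
    """
    # One SELECT for the whole month instead of one get() per row.
    instances = list(model.objects.filter(month=month))
    for mv in instances:
        mv.percentile = nan_to_none(percentile_by_org_id.get(mv.org_id))
    # bulk_update (Django 2.2+) writes each batch with a single UPDATE.
    model.objects.bulk_update(instances, ["percentile"], batch_size=batch_size)
    return len(instances)
```

With the dataframe from the question, the mapping could be built as dict(zip(df["org_id"], df["percentile"])) and passed in as write_percentiles(MeasureValue, month, mapping). With roughly 10,000 rows and the default batch size, this issues about ten UPDATE statements instead of thousands.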