How to speed up iteration over a large dataset in Django


This article looks at how to speed up iterating over a large dataset returned by the Django ORM; it should be a useful reference for anyone hitting the same problem.

Problem Description

I have a queryset of approximately 1500 records from a Django ORM query. I have used the select_related() and only() methods to make sure the query is tight, and I have checked connection.queries to confirm there is only this one query. That is, I have made sure no extra queries are triggered on each iteration.

When I cut and paste the query from connection.queries and run it directly, it completes in 0.02 seconds. However, it takes seven seconds to iterate over those records and do nothing with them (pass).

What can I do to speed this up? What causes this slowness?
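
The kind of check described above can be reproduced roughly as follows. This is only a minimal sketch assuming hypothetical Book/Author models in an app called myapp; the asker's real models, fields, and query are not shown in the post.

```python
# Minimal sketch (assumed models): verify that iterating the queryset
# issues a single SQL statement, as the question describes.
from django.db import connection, reset_queries

from myapp.models import Book  # hypothetical app and model

queryset = (
    Book.objects
    .select_related("author")        # pull the FK row in the same query
    .only("title", "author__name")   # limit the columns that are fetched
)

reset_queries()
for book in queryset:                # forces evaluation of the queryset
    pass

# With DEBUG = True, connection.queries records every SQL statement run;
# a single entry here means no extra per-row queries were issued.
print(len(connection.queries))
print(connection.queries[0]["sql"])
print(connection.queries[0]["time"])
```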

Solution

A QuerySet can get pretty heavy when it is full of model objects. In similar situations, I have used the .values() method on the queryset to fetch only the fields I need as a list of dictionaries, which can be much faster to iterate over. See http://docs.djangoproject.com/en/1.3/ref/models/querysets/#values-list
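
As an illustration of the suggestion, here is a minimal sketch using the same hypothetical Book model and placeholder field names as above. The point is that .values() skips building full model instances for every row, which is where much of the per-iteration cost goes.

```python
# Minimal sketch (assumed models): iterate dictionaries instead of
# instantiating ~1500 full model objects.
from myapp.models import Book  # hypothetical app and model

# Heavy: each row pays the cost of constructing a Book (and Author) instance.
for book in Book.objects.select_related("author"):
    pass

# Lighter: the ORM returns plain dictionaries with only the named fields.
rows = Book.objects.values("id", "title", "author__name")
for row in rows:
    # row is a dict, e.g. {"id": 1, "title": "...", "author__name": "..."}
    pass

# values_list() is the related option that yields tuples instead of dicts.
titles = Book.objects.values_list("title", flat=True)
```

The trade-off is that dictionaries have no model methods or properties, so this works best when the loop only needs raw field values.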



That concludes this article on speeding up iteration over a large dataset in Django; hopefully the answer above helps.
