一再追加到大名单

一再追加到大名单

本文介绍了一再追加到大名单(Python的2.6.6)的处理方法,对大家解决问题具有一定的参考价值,需要的朋友们下面随着小编来一起学习吧!

问题描述

我有一个项目,我在ASCII值通过串口读取从微控制器(看起来像这样:AA FF BA 11 43 CF等)
输入被快速进来(38两个字符集/秒)。
我采取这一输入,并将其追加到所有的测量运行列表。

约5小时后,我的名单已发展到约855000项。

我。据了解,名单变得越大,速度越慢列表操作变得。我的意图是已经此测试运行24小时,这应该产生围绕3M结果

有没有追加到一个列表,然后list.append更高效,更快捷的方式()?

谢谢大家。


解决方案

That's not true in general. Lists in Python are, despite the name, not linked lists but arrays. There are operations that are O(n) on arrays (copying and searching, for instance), but you don't seem to use any of these. As a rule of thumb: If it's widely used and idiomatic, some smart people went and chose a smart way to do it. list.append is a widely-used builtin (and the underlying C function is also used in other places, e.g. list comprehensions). If there was a faster way, it would already be in use.

As you will see when you inspect the source code, lists are overallocating, i.e. when they are resized, they allocate more than needed for one item so the next n items can be appended without need to another resize (which is O(n)). The growth isn't constant, it is proportional with the list size, so resizing becomes rarer as the list grows larger. Here's the snippet from listobject.c:list_resize that determines the overallocation:

/* This over-allocates proportional to the list size, making room
 * for additional growth.  The over-allocation is mild, but is
 * enough to give linear-time amortized behavior over a long
 * sequence of appends() in the presence of a poorly-performing
 * system realloc().
 * The growth pattern is:  0, 4, 8, 16, 25, 35, 46, 58, 72, 88, ...
 */
new_allocated = (newsize >> 3) + (newsize < 9 ? 3 : 6);

As Mark Ransom points out, older Python versions (<2.7, 3.0) have a bug that make the GC sabotage this. If you have such a Python version, you may want to disable the gc. If you can't because you generate too much garbage (that slips refcounting), you're out of luck though.

这篇关于一再追加到大名单(Python的2.6.6)的文章就介绍到这了,希望我们推荐的答案对大家有所帮助,也希望大家多多支持!

08-07 02:58