问题描述
我一直在根据生成器表达式比普通循环更有效的理论进行操作.但是随后我遇到了以下示例:写一个给定数字 N
的函数,并且某些因子 ps
返回下所有数字的总和N
是至少一个因子的倍数.
I have been operating under the theory that generator expressions tend to be more efficient than normal loops. But then I ran into the following example: write a function which given a number, N
, and some factors, ps
, returns the sum of all the numbers under N
that are a multiple of at least one factor.
这是一个循环版本和一个较短的生成器表达式版本:
Here is a loop version and a shorter generator expression version:
def loops(N, ps):
total_sum = 0
for i in xrange(N):
for p in ps:
if i%p == 0:
total_sum += i
break
return total_sum
def genexp(N, ps):
return sum(i for i in xrange(N)
if any(i%p == 0 for p in ps))
我希望两者的性能大致相同,也许理解版本会更快一些,但是我没想到的是:
I'd expect the two to perform roughly equal, with maybe the comprehension version a little faster, but what I didn't expect was this:
for func in ('loops', 'genexp'):
print func, timeit.timeit('%s(100000, [3,5,7])' % func,
number=100,
setup='from __main__ import %s' % func)
loops 2.82878184319
genexp 10.1663100719
慢4倍甚至还没有接近!为什么?我有什么误会?
4x slower isn't even close! Why? What am I misunderstanding?
推荐答案
首先:生成器表达式是内存有效的,不一定是速度有效的.
First of all: generator expressions are memory efficient, not necessarily speed efficient.
您的紧凑型 genexp()
版本较慢,原因有两个:
Your compact genexp()
version is slower for two reasons:
-
生成器表达式是使用新作用域(如新函数)实现的.您正在生成 N 个新范围,每个
any()
测试一个.创建新的作用域并再次将其拆解是相对昂贵的,当然,在循环中完成该操作之后,再与不执行此操作的代码进行比较.
Generator expressions are implemented using a new scope (like a new function). You are producing N new scopes, one for for each
any()
test. Creating a new scope and tearing it down again is relatively expensive, certainly when done in a loop and then compared with code that doesn't do this.
sum()
和 any()
名称是要查询的其他全局变量.对于 any()
,这是每个测试额外的 N 个全局查询.必须在字典中查找全局变量,而在C数组中通过索引查找局部变量(这非常快).
The sum()
and any()
names are additional globals to be looked up. In the case of any()
, that's an additional N global lookups per test. Globals must be looked up in a dictionary, versus locals which are looked up by index in a C-array (which is very fast).
后者只是一个很小的组成部分,大部分成本在于创建和销毁框架(范围);如果创建一个版本,其中 _any
和 _sum
是该函数的本地变量,则性能会有所改善:
The latter is but a small component, most of the cost lies in creating and destroying frames (scopes); if you create a version where _any
and _sum
are locals to the function you get but a small improvement in performance:
>>> def genexp_locals(N, ps, _any=any, _sum=sum):
... return _sum(i for i in xrange(N)
... if _any(i%p == 0 for p in ps))
...
>>> for func in ('loops', 'genexp', 'genexp_locals'):
... print func, timeit.timeit('%s(100000, [3,5,7])' % func,
... number=100,
... setup='from __main__ import %s' % func)
...
loops 2.00835800171
genexp 6.45241594315
genexp_locals 6.23843789101
我没有为 xrange
创建本地以保持相同的外观.从技术上讲,生成器表达式代码对象将 _any
名称作为闭包而不是局部的名称进行查找,其速度不像全局查找那样慢,但也不如局部查找那么快.
I didn't create a local for xrange
to keep that aspect the same. Technically speaking, the _any
name is looked up as a closure, not a local, by the generator expression code object, which are not as slow as global lookups but not quite as speedy as a local lookup either.
这篇关于为什么此生成器表达式函数比循环版本慢?的文章就介绍到这了,希望我们推荐的答案对大家有所帮助,也希望大家多多支持!