问题描述
我的列表理解近似为:
[f(x) for x in l if f(x)]
其中l是列表,而f(x)是返回列表的昂贵函数.
Where l is a list and f(x) is an expensive function which returns a list.
我想避免对f(x)的每个非空出现两次评估f(x).有什么方法可以将其输出保存在列表推导中?
I want to avoid evaluating f(x) twice for every non-empty occurance of f(x). Is there some way to save its output within the list comprehension?
我可以删除最终条件,生成整个列表,然后修剪它,但这似乎很浪费.
I could remove the final condition, generate the whole list and then prune it, but that seems wasteful.
修改:
已提出两种基本方法:
内部生成器理解:
[y for y in (f(x) for x in l) if y]
或备忘录.
我认为内部生成器理解可以很好地解决上述问题.实际上,我确实简化了这个问题以使其明确,我真的想要:
I think the inner generator comprehension is elegant for the problem as stated. In actual fact I simplified the question to make it clear, I really want:
[g(x, f(x)) for x in l if f(x)]
对于这种更复杂的情况,我认为记忆可以产生更干净的最终结果.
For this more complicated situation, I think memoization produces a cleaner end result.
推荐答案
一种解决方案(如果您重复使用x的话最好)是记忆函数f,即创建包装器保存要求调用该函数的参数并保存它的函数,如果要求相同的值,则返回它.
A solution (the best if you have repeated value of x) would be to memoize the function f, i.e. to create a wrapper function that saves the argument by which the function is called and save it, than return it if the same value is asked.
以下是一个非常简单的实现:
a really simple implementation is the following:
storage = {}
def memoized(value):
if value not in storage:
storage[value] = f(value)
return storage[value]
[memoized(x) for x in l if memoized(x)]
,然后在列表理解中使用此功能.这种方法在两个条件下有效,一个是理论上的,一个是实践上的.第一个是函数 f 应该是确定性的,即在输入相同的情况下返回相同的结果,另一个是对象 x 可以用作字典.键.如果第一个无效,则应根据每次定义重新计算f,而如果第二个失败,则可以使用一些更健壮的方法.
and then use this function in the list comprehension. This approach is valid under two condition, one theoretical and one practical. The first one is that the function f should be deterministic, i.e. returns the same results given the same input, and the other is that the object x can be used as a dictionary keys. If the first one is not valid than you should recompute f each timeby definition, while if the second one fails it is possible to use some slightly more robust approaches.
您可以在网上找到很多备忘录的实现,而且我认为新版本的python也包含一些内容.
You can find a lot of implementation of memoization around the net, and I think that the new versions of python have something included in them too.
另一方面,不要将小L用作变量名,这是一个坏习惯,因为它在某些终端上可能会与i或1混淆.
On a side note, never use the small L as a variable name, is a bad habit as it can be confused with an i or a 1 on some terminals.
如前所述,使用生成器理解(以避免创建无用的重复临时对象)的可能解决方案将是以下表达式:
as commented, a possible solution using generators comprehension (to avoid creating useless duplicate temporaries) would be this expression:
[g(x, fx) for x, fx in ((x,f(x)) for x in l) if fx]
给定f的计算成本,原始列表中的重复次数和处置时的内存,您需要权衡选择.记忆化在空间速度上进行了权衡,这意味着它会跟踪保存每个结果的方式,因此,如果您有大量列表,则在内存占用方面可能会变得代价高昂.
You need to weight your choice given the computational cost of f, the number of duplication in the original list and memory at you disposition. Memoization make a space-speed tradeoff, meaning that it keep tracks of each result saving it, so if you have huge lists it can became costly on the memory occupation front.
这篇关于Python列表理解-希望避免重复评估的文章就介绍到这了,希望我们推荐的答案对大家有所帮助,也希望大家多多支持!