优化与dict.update（）类似的功能，但添加了常用值

本文介绍了优化与dict.update（）类似的功能，但添加了常用值的处理方法，对大家解决问题具有一定的参考价值，需要的朋友们下面随着小编来一起学习吧！

问题描述

嘿伙计们，

我以为我会把它扔出去，因为每个人都喜欢优化

问题（这是真的，结账对这类

问题的回复数量！）

现在我有一个如下所示的功能。我想结合两个

词典，并将值加在一起，哪里有重叠。

我认为这个功能会相当快，但我在每个字典中都有大约1

百万的键（约80％重叠）并且它只运行所有

晚上。

所以也许我的功能不是O（n）？我真的认为它可能是

字典查找是O（nlogn）？

无论如何这里是函数。任何关于完成这项任务的想法很快就会被贬值。

谢谢！

< code apology =抱歉，如果gmail杀死我的标签>

def add_freqs（freq1，freq2）：

"""添加两个单词freq dicts" ;"

newfreq = {}

为密钥，值为freq1.items（）：

newfreq [key] = value + freq2 .get（key，0）

表示密钥，值为freq2.items（）：

newfreq [key] = value + freq1.get（key，0）

返回newfreq

freq1 = {

''dog''：1，

''猫''：2，

''人类''：3

}

freq2 = {

' 'perro''：1，

''gato''：1，

''human''：2

}

print add_freqs（freq1，freq2）

answer_I_want = {

''dog''：1，

''猫''：2，

''perro''：1，

''gato''：1，

' '人''：5
}

< / code>

-

Gregory Pi？ero

首席创新官

混合技术

（）

解决方案

-

Gregory Pi？ero

首席创新官

混合技术

（）

使用items（）您将整个字典复制到元组列表;
iteritems（）只是遍历现有字典并创建一个元组
一次。

80％重叠，你正在查找并在你的for循环中设置两个五个值中的四个。

转储对称并尝试以下方法之一：

def add_freqs2（freq1，freq2）：
total = dict（freq1）
for key，freq2中的值.iteritems（）：
如果键在freq1：
总[键] + =值
否则：
总[键] =值
返回总数

def add_freqs3（freq1，freq2）：
total = dict（freq1）
for key，value in freq2.iteritems（）：
尝试：
total [key] + = value
除了KeyError：
总计[key] = value
返回总数

我的猜测是add_freqs3（）表现最好。

谢谢Peter，这些都是非常好的想法。我迫不及待地想在今晚试一试。

这是关于你的功能的问题。如果我只看了
freq2中的键，那么我不会错过freq1中的任何键而不是freq2中的键吗？
这就是为什么我在原始函数中有两个循环的原因。

不，因为声明

total = dict（freq1）将total创建为freq1的浅表副本。因此

剩下要做的就是从freq2添加物品。

问候

史蒂夫

-

Steve Holden +44 150 684 7255 +1 800 494 3119

Holden Web LLC

PyCon TX 2006

Hey guys,

I thought I''d throw this out there since everyone loves optimization
questions (it''s true, check out the number of replies to those type of
questions!)

Right now I have a function shown below. I want to combine two
dictionaries and add the values together where-ever there is overlap.
I figured this function would be reasonably fast, but I have around 1
million keys in each dictionary (~80% overlap) and it just runs all
night.

So maybe my function is not O(n)? I really figured it was but maybe
dictionary lookups are O(nlogn)?

Anyway here is the function. Any ideas on doing this task a lot
faster would be appriciated.

Thanks!

<code apology=sorry if gmail kills my tabs>

def add_freqs(freq1,freq2):
"""Add two word freq dicts"""
newfreq={}
for key,value in freq1.items():
newfreq[key]=value+freq2.get(key,0)
for key,value in freq2.items():
newfreq[key]=value+freq1.get(key,0)
return newfreq

freq1={
''dog'':1,
''cat'':2,
''human'':3
}
freq2={
''perro'':1,
''gato'':1,
''human'':2
}
print add_freqs(freq1,freq2)

answer_I_want={
''dog'':1,
''cat'':2,
''perro'':1,
''gato'':1,
''human'':5
}
</code>
--
Gregory Pi?ero
Chief Innovation Officer
Blended Technologies
(www.blendedtechnologies.com)

解决方案

With items() you copy the whole dictionary into a list of tuples;
iteritems() just walks over the existing dictionary and creates one tuple
at a time.

With "80% overlap", you are looking up and setting four out of five values
twice in your for-loops.

Dump the symmetry and try one of these:

def add_freqs2(freq1, freq2):
total = dict(freq1)
for key, value in freq2.iteritems():
if key in freq1:
total[key] += value
else:
total[key] = value
return total

def add_freqs3(freq1, freq2):
total = dict(freq1)
for key, value in freq2.iteritems():
try:
total[key] += value
except KeyError:
total[key] = value
return total

My guess is that add_freqs3() will perform best.

Peter

--
Gregory Pi?ero
Chief Innovation Officer
Blended Technologies
(www.blendedtechnologies.com)

Thanks Peter, those are some really good ideas. I can''t wait to try
them out tonight.

Here''s a question about your functions. if I only look at the keys in
freq2 then won''t I miss any keys that are in freq1 and not in freq2?
That''s why I have the two loops in my original function.

No, because the statement

total = dict(freq1) creates total as a shallow copy of freq1. Thus
all that remains to be done is to add the items from freq2.

regards
Steve
--
Steve Holden +44 150 684 7255 +1 800 494 3119
Holden Web LLC www.holdenweb.com
PyCon TX 2006 www.python.org/pycon/

这篇关于优化与dict.update（）类似的功能，但添加了常用值的文章就介绍到这了，希望我们推荐的答案对大家有所帮助，也希望大家多多支持！