This post covers how to count the unique values for each unique key in a Python dictionary.
Problem description
I have a dictionary like this:
yahoo.com|98.136.48.100
yahoo.com|98.136.48.105
yahoo.com|98.136.48.110
yahoo.com|98.136.48.114
yahoo.com|98.136.48.66
yahoo.com|98.136.48.71
yahoo.com|98.136.48.73
yahoo.com|98.136.48.75
yahoo.net|98.136.48.100
g03.msg.vcs0|98.136.48.105
in which I have repeated keys and values. What I want is a final dictionary with unique keys (IPs) and a count of unique values (domains) for each. I already have the code below:
for dirpath, dirs, files in os.walk(path):
    for filename in fnmatch.filter(files, '*.txt'):
        with open(os.path.join(dirpath, filename)) as f:
            for line in f:
                if line.startswith('.'):
                    ip = line.split('|',1)[1].strip('\n')
                    semi_domain = (line.rsplit('|',1)[0]).split('.',1)[1]
                    d[ip]= semi_domains
                    if ip not in d:
                        key = ip
                        val = [semi_domain]
                        domains_per_ip[key]= val
But this is not working properly. Can somebody help me out with this?
Recommended answer
Use a defaultdict:
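from collections import defaultdict

# Each key (IP) maps to the set of distinct values (domains) seen for it.
d = defaultdict(set)

with open('somefile.txt') as the_file:
    for line in the_file:
        if line.strip():  # skip blank lines
            # Strip the trailing newline so it does not end up in the key.
            value, key = line.strip().split('|')
            d[key].add(value)

for k, v in d.items():  # use d.iteritems() in Python 2
    print('{} - {}'.format(k, len(v)))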
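To try the same idea without a file on disk, here is a minimal sketch that runs against the sample lines from the question. The names `SAMPLE` and `count_unique_domains` are made up for illustration, and the sketch assumes every non-blank line has the form domain|ip.

```python
from collections import defaultdict

# Sample lines copied from the question, in "domain|ip" form.
SAMPLE = """\
yahoo.com|98.136.48.100
yahoo.com|98.136.48.105
yahoo.com|98.136.48.110
yahoo.com|98.136.48.114
yahoo.com|98.136.48.66
yahoo.com|98.136.48.71
yahoo.com|98.136.48.73
yahoo.com|98.136.48.75
yahoo.net|98.136.48.100
g03.msg.vcs0|98.136.48.105
"""


def count_unique_domains(lines):
    """Return {ip: number of distinct domains seen for that ip}.

    Hypothetical helper: assumes each non-blank line is "domain|ip".
    """
    domains_per_ip = defaultdict(set)
    for line in lines:
        line = line.strip()
        if not line:
            continue  # skip blank lines
        domain, ip = line.rsplit('|', 1)
        # A set ignores repeats, so each domain is counted once per IP.
        domains_per_ip[ip].add(domain)
    return {ip: len(domains) for ip, domains in domains_per_ip.items()}


counts = count_unique_domains(SAMPLE.splitlines())
for ip, n in sorted(counts.items()):
    print('{} - {}'.format(ip, n))
```

With this sample, 98.136.48.100 and 98.136.48.105 each map to two domains and every other IP maps to one. Using a set as the default value is what makes the count "unique": adding the same domain twice for an IP has no effect.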