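I need to extract the unique strings from a file (*.txt), but the code I wrote prints the same string over and over. I need each unique string to be printed only once.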

import re

f = open('C:\\isg-2000.txt')
p = f.readlines()
print(len(p))
for i in range(len(p)):
    # findall returns a list of every match found on this line
    S = re.findall(r'set vrouter \".+?\"', p[i])
    if S:
        print(S)


The output looks like this:

4438
['set vrouter "untrust-vr"']
['set vrouter "trust-vr"']
['set vrouter "UntrustGi-vr"']
['set vrouter "TrustGi-vr"']
['set vrouter "CNDT-vr"']
['set vrouter "MGT"']
['set vrouter "MGT"']
['set vrouter "MGT"']
['set vrouter "untrust-vr"']
['set vrouter "trust-vr"']
['set vrouter "UntrustGi-vr"']
['set vrouter "TrustGi-vr"']
['set vrouter "CNDT-vr"']
['set vrouter "MGT"']
['set vrouter "untrust-vr"']
['set vrouter "trust-vr"']
['set vrouter "UntrustGi-vr"']
['set vrouter "TrustGi-vr"']
['set vrouter "CNDT-vr"']
['set vrouter "MGT"']

Best answer

Use a set together with a generator expression:

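import re

# Compile the pattern once instead of on every line
r = re.compile(r'set vrouter \".+?\"')
with open('C:\\isg-2000.txt') as f:
    # A set keeps only one copy of each matched string
    unique_matches = set(m for line in f for m in r.findall(line))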
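
To try this without the original file, here is a minimal sketch that runs the same pattern over a few in-memory lines. The sample lines and the names `VROUTER_RE`, `unique_vrouter_matches` and `sample_lines` are made up for illustration; only the regular expression comes from the question.

```python
import re

# Same pattern as above; the quotes need no escaping inside a raw string.
VROUTER_RE = re.compile(r'set vrouter ".+?"')


def unique_vrouter_matches(lines):
    """Return the set of distinct 'set vrouter "..."' strings found in lines."""
    return set(m for line in lines for m in VROUTER_RE.findall(line))


# Hypothetical sample data standing in for the real config file.
sample_lines = [
    'set vrouter "untrust-vr"\n',
    'set vrouter "trust-vr"\n',
    'set vrouter "MGT"\n',
    'set vrouter "MGT"\n',
    'set vrouter "untrust-vr"\n',
    'unset something else\n',
]

unique_set = unique_vrouter_matches(sample_lines)
# A set has no defined order, so sort before printing for stable output.
for match in sorted(unique_set):
    print(match)
```

Each string is printed once no matter how many lines contain it. Any iterable of lines works here, so an open file object can be passed in place of the list.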


Note that sets do not preserve order. If order matters, use collections.OrderedDict:

from collections import OrderedDict
...
unique_matches = list(OrderedDict.fromkeys(m for line in f for m in r.findall(line)))
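
As a self-contained illustration of the ordered variant, the sketch below keeps the first occurrence of each match in the order it appears. The sample lines and the names `VROUTER_PATTERN`, `ordered_unique_matches` and `sample_config` are assumptions for the example, not part of the original answer.

```python
import re
from collections import OrderedDict

VROUTER_PATTERN = re.compile(r'set vrouter ".+?"')


def ordered_unique_matches(lines):
    """Return distinct matches as a list, in first-seen order."""
    matches = (m for line in lines for m in VROUTER_PATTERN.findall(line))
    # fromkeys keeps one key per distinct match and remembers insertion order.
    return list(OrderedDict.fromkeys(matches))


# Hypothetical sample data standing in for the real config file.
sample_config = [
    'set vrouter "untrust-vr"\n',
    'set vrouter "trust-vr"\n',
    'set vrouter "MGT"\n',
    'set vrouter "MGT"\n',
    'set vrouter "untrust-vr"\n',
]

ordered_result = ordered_unique_matches(sample_config)
for match in ordered_result:
    print(match)
```

On Python 3.7 and later a plain `dict.fromkeys(...)` preserves insertion order as well, so `OrderedDict` is mainly needed on older interpreters.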

This question, "python - getting unique strings with findall", comes from a similar question on Stack Overflow: https://stackoverflow.com/questions/21456557/

10-08 21:44