I have a list of lists, for example:

names =  [['cat', 'fish'], ['cat'], ['fish', 'dog', 'cat'],
 ['cat', 'bird', 'fish'], ['fish', 'bird']]

I want to count how many times each pair of names is mentioned together across the whole list. The output should look like this:
{ ['cat', 'fish']: 3, ['cat', 'dog']: 1, ['cat', 'bird']: 1,
  ['fish', 'dog']: 1, ['fish', 'bird']: 2 }

I tried:
from collections import Counter
from collections import defaultdict

co_occurences = defaultdict(Counter)
for tags in names:
    for key in tags:
        co_occurences[key].update(tags)

print co_occurences

But it does not count the co-occurrences of pairs across the main list.
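
For reference, the loop in the attempt above does collect co-occurrence information, but in a nested form: one Counter per name, which also counts each name against itself and stores every pair twice (once under each name). The following is only a sketch of how that nested result could be flattened into pair counts. It assumes no name is repeated inside a single inner list, and pair_counts is a hypothetical name that does not appear in the original code.

```python
from collections import Counter, defaultdict

names = [['cat', 'fish'], ['cat'], ['fish', 'dog', 'cat'],
         ['cat', 'bird', 'fish'], ['fish', 'bird']]

# Same loop as the attempt above: one Counter per name.
co_occurences = defaultdict(Counter)
for tags in names:
    for key in tags:
        co_occurences[key].update(tags)

# Flatten the nested counters into pair counts.
# Keeping only a < b drops the self-counts (a == b) and the
# mirrored duplicates such as ('fish', 'cat') vs ('cat', 'fish').
pair_counts = {
    (a, b): n
    for a, counter in co_occurences.items()
    for b, n in counter.items()
    if a < b
}
print(pair_counts)
```

Each key is a tuple with its two names in alphabetical order, so every unordered pair appears exactly once.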

Best answer

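You can use the & operator in Python, which computes the intersection when applied to sets, and do the comparison by converting each inner list into a set.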

>>> set(['cat','dog']) & set(['cat','dog','monkey','horse','fish'])
set(['dog', 'cat'])
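
Comparing an intersection back to the original set, as the function listOccurences does, amounts to a subset test. The short sketch below (Python 3 syntax; the variable names are illustrative only) shows that the intersection check, the <= operator, and set.issubset give the same answer.

```python
wanted = {'cat', 'fish'}
row = {'fish', 'dog', 'cat'}

# Three equivalent ways to ask "does row contain every item in wanted?"
via_intersection = (wanted & row) == wanted
via_subset = wanted <= row
via_method = wanted.issubset(row)
print(via_intersection, via_subset, via_method)  # True True True

# A pair that is not fully contained in row fails the test.
missing = {'cat', 'bird'} <= row
print(missing)  # False
```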


def listOccurences(item, names):
    # item is the list that you want to check, eg. ['cat','fish']
    # names contain the list of list you have.
    set_of_items = set(item) # set(['cat','fish'])
    count = 0
    for value in names:
        if set_of_items & set(value) == set_of_items:
            count+=1
    return count

import itertools

names =  [['cat', 'fish'], ['cat'], ['fish', 'dog', 'cat'],['cat', 'bird', 'fish'], ['fish', 'bird']]
# Now generate every candidate pair:
# chain flattens the list of lists, set removes duplicates, and combinations yields each unordered pair once.
permuted_values = list(itertools.combinations(set(itertools.chain.from_iterable(names)), 2))
d = {}
for v in permuted_values:
    d[str(v)] = listOccurences(v, names)
# A list cannot be a dict key because it is unhashable, so each pair is converted to a string here
# (the tuple v itself would also work as a key).
print(d)
# Keys are strings such as "('cat', 'fish')"; key order and the order inside each pair depend on set iteration order.
# Counts: cat/fish 3, cat/dog 1, cat/bird 1, fish/dog 1, fish/bird 2, and dog/bird 0 (that pair never co-occurs).
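
The approach above checks every candidate pair against every inner list. As an alternative (not part of the original answer), a minimal sketch with collections.Counter and itertools.combinations counts the pairs in a single pass over names. It uses tuples as keys instead of strings, and the name pair_counts is hypothetical.

```python
from collections import Counter
from itertools import combinations

names = [['cat', 'fish'], ['cat'], ['fish', 'dog', 'cat'],
         ['cat', 'bird', 'fish'], ['fish', 'bird']]

pair_counts = Counter()
for tags in names:
    # sorted(set(tags)) removes duplicates within one list and fixes the
    # order, so ('cat', 'fish') and ('fish', 'cat') count as the same pair.
    pair_counts.update(combinations(sorted(set(tags)), 2))

print(dict(pair_counts))
```

Only pairs that actually occur together end up in the result, so pairs with a count of 0 (such as dog and bird) are absent here, whereas the dictionary d above includes them.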

Regarding "python - co-occurrence of two items in different lists in Python", a similar question can be found on Stack Overflow: https://stackoverflow.com/questions/42272311/
