I have the following list of dictionaries:

artist_and_tags = [{u'Yo La Tengo': ['indie', 'indie rock', 'seen live', 'alternative', 'indie pop', 'rock', 'post-rock', 'dream pop', 'shoegaze', 'noise pop', 'folk', 'experimental', 'alternative rock', 'american', 'lo-fi', 'pop', 'new jersey', 'yo la tengo', 'usa', 'noise rock', '90s', 'noise', '00s', 'ambient', 'post-punk', '80s', 'mellow', 'psychedelic', 'hoboken', 'experimental rock', 'singer-songwriter', 'post rock', 'electronic', 'female vocalists', 'alt-country', 'dreamy', 'matador', 'chillout', 'instrumental', 'favorites', 'punk', 'electronica', 'slowcore', 'folk rock', 'new wave', 'jazz', 'eclectic', 'new york', 'emo']}, {u'Radiohead': ['alternative', 'alternative rock', 'rock', 'indie', 'electronic', 'seen live', 'british', 'britpop', 'indie rock', 'experimental', 'radiohead', 'progressive rock', '90s', 'electronica', 'art rock', 'experimental rock', 'post-rock', 'psychedelic', 'uk', 'male vocalists', 'pop', '00s', 'ambient', 'chillout', 'progressive', 'favorites', 'melancholic', 'awesome', 'overrated', 'english', 'beautiful', 'classic rock', 'genius', 'melancholy', 'better than radiohead', 'trip-hop', 'idm', 'indie pop', 'emo']}, {u'Portishead': ['trip-hop', 'electronic', 'female vocalists', 'chillout', 'trip hop', 'alternative', 'electronica', 'seen live', 'downtempo', 'british', 'indie', 'portishead', 'experimental', 'ambient', 'female vocalist', 'alternative rock', '90s', 'lounge', 'mellow', 'bristol', 'jazz', 'psychedelic', 'chill', 'melancholic', 'triphop', 'uk', 'rock', 'bristol sound', 'acid jazz', 'lo-fi']}]


I am using it to measure how closely related the artists are.

To do that, I am doing:

tags0 = set(list(artist_and_tags[0].values())[0])
tags1 = set(list(artist_and_tags[1].values())[0])
tags2 = set(list(artist_and_tags[2].values())[0])


Then:

intersection1 = tags0 & tags1
intersection2 = tags0 & tags2
intersection3 = tags1 & tags2


So:

print(intersection1, len(intersection1), intersection2, len(intersection2), intersection3, len(intersection3))


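shows me that 'Yo La Tengo' is closer to 'Radiohead' than 'Portishead' is, with 20 tags in common.

The code seems a bit redundant, though...

Question:

Is there a way to use this logic in a for loop (or wrap it in a simple function) so that it works with a dictionary of n artists (keys)?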

Best answer

You can use itertools.combinations:

import itertools
import collections

# Pair each artist name with the set of its tags.
ArtistTags = collections.namedtuple('ArtistTags', ('name', 'tags'))
tags = (ArtistTags(artist, set(tags))
        for artists_dict in artist_and_tags
        for artist, tags in artists_dict.items())
# Every unordered pair of artists, each pair exactly once.
artist_pairings = itertools.combinations(tags, 2)
# Score each pair by the number of tags the two artists share.
intersections = ((len(a.tags & b.tags), a, b) for a, b in artist_pairings)
# Most similar pairs first.
for n, a, b in sorted(intersections, reverse=True):
    print(n, a.name, b.name)


Output:

20 Yo La Tengo Radiohead
16 Yo La Tengo Portishead
16 Radiohead Portishead
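
Since the question also asks about wrapping the logic in a simple function, here is a minimal sketch of one way to do that. The function name rank_artist_pairs and the sample data are hypothetical, not part of the original answer. It assumes each artist appears only once in the list (a repeated artist would overwrite the earlier entry) and that dictionaries keep insertion order (Python 3.7+).

```python
import itertools


def rank_artist_pairs(artist_and_tags):
    """Return (shared_tag_count, artist_a, artist_b) tuples, most similar first.

    Hypothetical helper: expects a list of single-key dicts shaped like
    the artist_and_tags list in the question.
    """
    # Flatten the list of dicts into one {artist: set_of_tags} mapping.
    tag_sets = {
        artist: set(tags)
        for artist_dict in artist_and_tags
        for artist, tags in artist_dict.items()
    }
    # Score every unordered pair of artists by the size of the intersection.
    scored = [
        (len(tag_sets[a] & tag_sets[b]), a, b)
        for a, b in itertools.combinations(tag_sets, 2)
    ]
    # Sort on the count only; ties keep the order produced by combinations().
    return sorted(scored, key=lambda item: item[0], reverse=True)


# Made-up sample data, for illustration only.
sample = [
    {'A': ['rock', 'indie', 'pop']},
    {'B': ['rock', 'indie', 'jazz']},
    {'C': ['jazz', 'ambient']},
]
for count, first, second in rank_artist_pairs(sample):
    print(count, first, second)
```

Sorting with an explicit key means only the counts are compared, so two pairs with the same count never fall back to comparing the remaining tuple elements.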

Regarding "python - intersecting with dictionaries", a similar question can be found on Stack Overflow: https://stackoverflow.com/questions/46020117/
