Problem description
I want to rank the elements of multiple lists by how often they appear across the lists. Example:
list1 = 1,2,3,4
list2 = 4,5,6,7
list3 = 4,1,8,9
result = 4,1,2,3,5,6,7,8,9 (4 is counted three times, 1 twice, and the rest once)
I've tried the following, but I need something smarter that works with any number of lists.
l = []
l.append([ 1, 2, 3, 4, 5])
l.append([ 1, 9, 3, 4, 5])
l.append([ 1, 10, 8, 4, 5])
l.append([ 1, 12, 13, 7, 5])
l.append([ 1, 14, 13, 13, 6])
x1 = set(l[0]) & set(l[1]) & set(l[2]) & set(l[3])
x2 = set(l[0]) & set(l[1]) & set(l[2]) & set(l[4])
x3 = set(l[0]) & set(l[1]) & set(l[3]) & set(l[4])
x4 = set(l[0]) & set(l[2]) & set(l[3]) & set(l[4])
x5 = set(l[1]) & set(l[2]) & set(l[3]) & set(l[4])
set1 = set(x1) | set(x2) | set(x3) | set(x4) | set(x5)
a1 = list(set(l[0]) & set(l[1]) & set(l[2]) & set(l[3]) & set(l[4]))
# getDifference is the asker's own helper and is not shown in the post
a2 = getDifference(list(set1),a1)
print(a1)
print(a2)
Now here is the problem: I could repeat this for a3, a4 and a5, but it gets too complex. I need a function for this, but I don't know how to write it. My math got stuck ;)
SOLVED: Thanks a lot for the discussion. As a newbie I rather like this system: fast and informative. You helped me out completely! Ty
Recommended answer
import collections

data = [
    [1, 2, 3, 4, 5],
    [1, 9, 3, 4, 5],
    [1, 10, 8, 4, 5],
    [1, 12, 13, 7, 5],
    [1, 14, 13, 13, 6],
]

def sorted_by_count(lists):
    # Count every occurrence of each number across all lists.
    counts = collections.defaultdict(int)
    for L in lists:
        for n in L:
            counts[n] += 1

    # Highest count first; ties are broken by the larger number first.
    return [num for num, count in
            sorted(counts.items(),
                   key=lambda k_v: (k_v[1], k_v[0]),
                   reverse=True)]

print(sorted_by_count(data))
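For what it's worth, the counting loop above can probably also be written with collections.Counter (available in Python 2.7 and later). This is only a sketch of the same idea, not part of the original answer; the name sorted_by_count_counter is made up for illustration.

```python
import collections
import itertools

data = [
    [1, 2, 3, 4, 5],
    [1, 9, 3, 4, 5],
    [1, 10, 8, 4, 5],
    [1, 12, 13, 7, 5],
    [1, 14, 13, 13, 6],
]

def sorted_by_count_counter(lists):
    # Hypothetical variant: Counter tallies every occurrence in one pass
    # over the flattened lists.
    counts = collections.Counter(itertools.chain.from_iterable(lists))
    # Highest count first; ties are broken by the larger number first,
    # mirroring the (count, value) key with reverse=True.
    return [num for num, count in
            sorted(counts.items(),
                   key=lambda k_v: (k_v[1], k_v[0]),
                   reverse=True)]

print(sorted_by_count_counter(data))
# [1, 5, 13, 4, 3, 14, 12, 10, 9, 8, 7, 6, 2]
```

The ordering within a tie (13 before 4) follows from sorting the (count, value) pairs in reverse.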
Now let's generalize it (take any iterable, loosen the hashability requirement), allow key and reverse parameters (to match sorted), and rename it to freq_sorted:
def freq_sorted(iterable, key=None, reverse=False, include_freq=False):
    """Return a list of items from iterable sorted by frequency.

    If include_freq, (item, freq) is returned instead of item.

    key(item) must be hashable, but items need not be.

    *Higher* frequencies are returned first.  Within the same frequency group,
    items are ordered according to key(item).
    """
    if key is None:
        key = lambda x: x

    key_counts = collections.defaultdict(int)
    items = {}
    for n in iterable:
        k = key(n)
        key_counts[k] += 1
        items.setdefault(k, n)

    if include_freq:
        def get_item(k, c):
            return items[k], c
    else:
        def get_item(k, c):
            return items[k]

    return [get_item(k, c) for k, c in
            sorted(key_counts.items(),
                   key=lambda kc: (-kc[1], kc[0]),
                   reverse=reverse)]
Example:
>>> import itertools
>>> print(freq_sorted(itertools.chain.from_iterable(data)))
[1, 5, 4, 13, 3, 2, 6, 7, 8, 9, 10, 12, 14]
>>> print(freq_sorted(itertools.chain.from_iterable(data), include_freq=True))
# (slightly reformatted)
[(1, 5),
(5, 4),
(4, 3), (13, 3),
(3, 2),
(2, 1), (6, 1), (7, 1), (8, 1), (9, 1), (10, 1), (12, 1), (14, 1)]
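Note that the examples above count every occurrence, so 13 gets a frequency of 3 because it appears twice in the last list. If what you want is the number of lists an element appears in, as the question literally asks, one option is to deduplicate each list before counting. A minimal sketch, assuming all elements are hashable; rank_by_list_count is a hypothetical name, not part of the answer:

```python
import collections

def rank_by_list_count(lists):
    # Hypothetical helper: counts[x] ends up as the number of lists
    # that contain x at least once.
    counts = collections.Counter()
    for lst in lists:
        # set() drops repeats, so each list contributes at most 1 per element.
        counts.update(set(lst))
    # Elements found in the most lists come first; ties go in ascending order.
    return sorted(counts, key=lambda n: (-counts[n], n))

print(rank_by_list_count([[1, 2, 3, 4], [4, 5, 6, 7], [4, 1, 8, 9]]))
# [4, 1, 2, 3, 5, 6, 7, 8, 9]
```

With this variant, an element repeated inside a single list is still counted only once for that list.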