How do I count the frequency of pairs in a list?


Problem description

Right now I can count the frequency of each word in a list:

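    words = ['a', 'b', 'a', 'c', 'a', 'c']  # named `words`; `list` would shadow the built-in

    frequency = {}
    for w in words:
        # Start each new item at 0, then add one per occurrence.
        frequency[w] = frequency.get(w, 0) + 1
    print(frequency)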

It gives me this output:

    {'a': 3, 'b': 1, 'c': 2}

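What I'd like instead is, for each list item, the frequency of the items that follow it. For example, 'b' comes right after 'a' once and 'c' comes right after 'a' twice: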

{'a':{'b':1,'c':2},'b':{'a':1},'c':{'a':1}}

How would I go about accomplishing this?

Solution

If you're willing to accept a slightly different format, it's easy to get the pairwise counts using collections.Counter and zip:

    >>> seq = list("abacac")
    >>> from collections import Counter
    >>> # Pair every item with its successor, then count the pairs.
    >>> c = Counter(zip(seq, seq[1:]))
    >>> c
    Counter({('a', 'c'): 2, ('b', 'a'): 1, ('c', 'a'): 1, ('a', 'b'): 1})
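To make the pairing step concrete, here is a minimal sketch that wraps the same Counter and zip idea in a function and looks up individual pairs. The name `pair_counts` is made up for this illustration and is not part of the original answer.

```python
from collections import Counter


def pair_counts(items):
    """Count each (item, next_item) pair of adjacent elements.

    `pair_counts` is a hypothetical helper name used only for this sketch.
    """
    # zip stops at the shorter input, so the last item gets no successor.
    return Counter(zip(items, items[1:]))


seq = list("abacac")
counts = pair_counts(seq)

print(list(zip(seq, seq[1:])))  # the adjacent pairs that get counted
print(counts[("a", "c")])       # how often 'c' directly follows 'a'
print(counts[("c", "b")])       # a pair that never occurs; Counter gives 0
```

Because a Counter returns 0 for missing keys, looking up a pair that never occurs does not raise a KeyError.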

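If you really want the format you gave, you have a few options, but one way is to use itertools.groupby to collect all the pairs that start with the same element (the pairs are sorted first because groupby only groups adjacent items):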

    >>> from itertools import groupby
    >>> # Sort so that pairs sharing a first element are adjacent, then group on it.
    >>> grouped = groupby(sorted(zip(seq, seq[1:])), lambda x: x[0])
    >>> {k: dict(Counter(x[1] for x in g)) for k, g in grouped}
    {'a': {'c': 2, 'b': 1}, 'c': {'a': 1}, 'b': {'a': 1}}
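As one of the other possible options, the nested format could also be built in a single pass with collections.defaultdict, which avoids sorting. This is a sketch of an alternative, not the approach from the answer above, and `successor_counts` is a name invented for the example.

```python
from collections import Counter, defaultdict


def successor_counts(items):
    """Map each item to a dict counting the items that directly follow it.

    `successor_counts` is a hypothetical helper name used only for this sketch.
    """
    following = defaultdict(Counter)
    for current, nxt in zip(items, items[1:]):
        following[current][nxt] += 1
    # Convert to plain dicts to match the format asked for in the question.
    return {key: dict(counter) for key, counter in following.items()}


result = successor_counts(list("abacac"))
print(result)
```

Since nothing is sorted here, the items only need to be hashable, whereas the sorted-based version also requires them to be comparable with each other.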
