Finding intersection sets between lists of sets

Question

The following question concerns Python 3.6. Suppose I have lists of sets, for example:

L1 = [{2,7},{2,7,8},{2,3,6,7},{1,2,4,5,7}]
L2 = [{3,6},{1,3,4,6,7},{2,3,5,6,8}]
L3 = [{2,5,7,8},{1,2,3,5,7,8}, {2,4,5,6,7,8}]

I need to find all the intersections obtained by taking one set from each of L1, L2, and L3. For example:

    {2,7}.intersection({3,6}).intersection({2,5,7,8})= empty
    {2,7}.intersection({3,6}).intersection({1,2,3,5,7,8})= empty
    {2,7}.intersection({3,6}).intersection({2,4,5,6,7,8})= empty
    {2,7}.intersection({1,3,4,6,7}).intersection({2,5,7,8})= {7}
    {2,7}.intersection({1,3,4,6,7}).intersection({1,2,3,5,7,8})= {7}
    {2,7}.intersection({1,3,4,6,7}).intersection({2,4,5,6,7,8})= {7}

    ...

If we continue like this for every combination, we end up with the following set:

{{empty},{2},{3},{6},{7},{2,3},{2,5},{2,6},{2,8},{3,7},{4,7},{6,7}}

Suppose:
- I have many lists L1, L2, L3, ..., Ln, and I do not know in advance how many there are.
- Each of the lists L1, L2, L3, ..., Ln is large, so I cannot load all of them into memory at once.

My question is: is there a way to calculate that set sequentially, e.g., calculate the result for L1 and L2, then combine that result with L3, and so on?

Recommended answer

You can first calculate all possible intersections between L1 and L2, then calculate the intersections between that result and L3, and so on. Only the running result and the current list need to be in memory at any time.

list_generator = iter([  # some generator that produces your lists
    [{2,7}, {2,7,8}, {2,3,6,7}, {1,2,4,5,7}],
    [{3,6}, {1,3,4,6,7}, {2,3,5,6,8}],
    [{2,5,7,8}, {1,2,3,5,7,8}, {2,4,5,6,7,8}],
])
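# For example, you can read the lists lazily from a file, one list per line.
# Assumed format (adapt to your needs): sets separated by '|', elements by ','.
# Elements are read as strings; wrap them in int() if you need numbers.
def list_generator_from_file(filename):
    with open(filename) as f:
        for line in f:
            yield [set(part.split(',')) for part in line.strip().split('|')]
# list_generator would then be list_generator_from_file('myfile.dat')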

intersections = next(list_generator)  # get first list
new_intersections = set()

for list_ in list_generator:
    for old in intersections:
        for new in list_:
            new_intersections.add(frozenset(old.intersection(new)))
    # at this point we don't need the current list any more
    intersections, new_intersections = new_intersections, set()

print(intersections)

The output looks like {frozenset({7}), frozenset({3, 7}), frozenset({3}), frozenset({6}), frozenset({2, 6}), frozenset({6, 7}), frozenset(), frozenset({8, 2}), frozenset({2, 3}), frozenset({1, 7}), frozenset({4, 7}), frozenset({2, 5}), frozenset({2})}, which matches your result except for the {1, 7} set, which you missed.
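
As a rough cross-check of the sequential approach, the loop can be wrapped in a function and compared against a brute-force enumeration over every combination. This is only a sketch: the names `fold_intersections` and `brute_force` are made up for illustration, and the brute-force version needs all lists in memory, so it is only suitable for small test data.

```python
from itertools import product


def fold_intersections(lists):
    """Sequentially fold an iterable of lists of sets into all intersections.

    Only the running result and the current list are held at once.
    (Hypothetical helper name, not part of the original answer.)
    """
    lists = iter(lists)
    try:
        current = {frozenset(s) for s in next(lists)}
    except StopIteration:
        return set()  # no lists at all
    for list_ in lists:
        # Intersect every result so far with every set of the new list.
        current = {frozenset(old.intersection(new))
                   for old in current for new in list_}
    return current


def brute_force(lists):
    """Reference version: intersect one set from each list, for every combination."""
    return {frozenset(set.intersection(*combo)) for combo in product(*lists)}


lists = [
    [{2, 7}, {2, 7, 8}, {2, 3, 6, 7}, {1, 2, 4, 5, 7}],
    [{3, 6}, {1, 3, 4, 6, 7}, {2, 3, 5, 6, 8}],
    [{2, 5, 7, 8}, {1, 2, 3, 5, 7, 8}, {2, 4, 5, 6, 7, 8}],
]

result = fold_intersections(iter(lists))
print(result == brute_force(lists))  # True
print(frozenset({1, 7}) in result)   # True
print(len(result))                   # 13
```

Because duplicates collapse into the running set after each step, the intermediate result is usually much smaller than the number of combinations, which is what makes the sequential version practical for large inputs.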

