我想实现以下算法。对于nk,请按排序顺序考虑具有重复的所有组合,其中我们从具有重复的k中选择{0,..n-1}个数。例如,如果n=5k =3我们有:
[(0,0,0),(0,0,1),(0,0,2),(0,0,3),(0,0,4),(0,1,1),(0,
1,2),(0,1,3),(0,1,4),(0,2,2),(0,2,3),(0,2,4),(0,3,
3),(0,3,4),(0,4,4),(1,1,1,(1,2),(1,1,3),(1,1,4),
(1,2,2),(1,2,3),(1,2,4),(1,3,3),(1,3,4),(1,4,4),(2,
2,2,(2,2,3),(2,2,4),(2,3,3),(2,3,4),(2,4,4),(3,3,
3),(3,3,4),(3,4,4),(4,4,4)]
从现在起,我将把每一个组合视为一个多集。我想贪婪地浏览这些多集并将列表分割。一个分区的属性是它内所有多集的交集的大小必须至少小于cc>。在这种情况下,我们有:

(0, 0, 0), (0, 0, 1), (0, 0, 2), (0, 0, 3), (0, 0, 4)

然后
 (0, 1, 1), (0, 1, 2), (0, 1, 3), (0, 1, 4)

然后
(0, 2, 2), (0, 2, 3), (0, 2, 4)

然后
(0, 3,  3), (0, 3, 4)

然后
(0, 4, 4)

等等。
在python中,可以按如下方式迭代组合:
import itertools
for multiset in itertools.combinations_with_replacement(range(5),3):
    #Greedy algo

如何创建这些分区?
我遇到的一个问题是如何计算多集的交集的大小。例如,多集k-1(2,1,2)的交集的大小为2。
以下是(3,2,2)的完整答案。
(0, 0, 0, 0), (0, 0, 0, 1), (0, 0, 0, 2), (0, 0, 0, 3)
(0, 0, 1, 1), (0, 0, 1, 2), (0, 0, 1, 3)
(0, 0, 2, 2), (0, 0, 2, 3)
(0, 0, 3, 3)
(0, 1, 1, 1), (0, 1, 1, 2), (0, 1, 1, 3)
(0, 1, 2, 2), (0, 1, 2, 3)
(0, 1, 3, 3)
(0, 2, 2, 2), (0, 2, 2, 3)
(0, 2, 3, 3), (0, 3, 3, 3)
(1, 1, 1, 1), (1, 1, 1, 2), (1, 1, 1, 3)
(1, 1, 2, 2), (1, 1, 2, 3)
(1, 1, 3, 3)
(1, 2, 2, 2), (1, 2, 2, 3)
(1, 2, 3, 3), (1, 3, 3, 3)
(2, 2, 2, 2), (2, 2, 2, 3)
(2, 2, 3, 3), (2, 3, 3, 3)
(3, 3, 3, 3)

最佳答案

创建分区的一种方法是遍历迭代器,然后将每个multiset*与前一个进行比较。我测试了4种方法**来比较多集,最快的方法是测试membershipin前一个多集的迭代器,一旦成员测试失败,它就会被消耗并短路。如果multiset和前一个multiset中相等项的数目等于multiset的长度减去1,则满足对它们进行分组的条件。然后,生成一个lists的输出生成器,在这里,您可以append满足前一个list的条件的项,并启动一个包含list的新tuple否则,将yield一次一个地对组进行操作,以最小化内存使用:

import itertools

def f(n,k):
    prev, group = None, []
    for multiset in itertools.combinations_with_replacement(range(n),k):
        if prev:
            it = iter(prev)
            for idx, item in enumerate(multiset):
                if item not in it:
                    break
            if idx == len(multiset) - 1:
                group.append(multiset)
                continue
        if group:
            yield group
        group = [multiset]
        prev = multiset
    yield group

测试用例
输入:
for item in f(4,4):
    print(item)

输出:
[(0, 0, 0, 0), (0, 0, 0, 1), (0, 0, 0, 2), (0, 0, 0, 3)]
[(0, 0, 1, 1), (0, 0, 1, 2), (0, 0, 1, 3)]
[(0, 0, 2, 2), (0, 0, 2, 3)]
[(0, 0, 3, 3)]
[(0, 1, 1, 1), (0, 1, 1, 2), (0, 1, 1, 3)]
[(0, 1, 2, 2), (0, 1, 2, 3)]
[(0, 1, 3, 3)]
[(0, 2, 2, 2), (0, 2, 2, 3)]
[(0, 2, 3, 3), (0, 3, 3, 3)]
[(1, 1, 1, 1), (1, 1, 1, 2), (1, 1, 1, 3)]
[(1, 1, 2, 2), (1, 1, 2, 3)]
[(1, 1, 3, 3)]
[(1, 2, 2, 2), (1, 2, 2, 3)]
[(1, 2, 3, 3), (1, 3, 3, 3)]
[(2, 2, 2, 2), (2, 2, 2, 3)]
[(2, 2, 3, 3), (2, 3, 3, 3)]
[(3, 3, 3, 3)]

输入:
for item in f(5,3):
    print(item)

输出:
[(0, 0, 0), (0, 0, 1), (0, 0, 2), (0, 0, 3), (0, 0, 4)]
[(0, 1, 1), (0, 1, 2), (0, 1, 3), (0, 1, 4)]
[(0, 2, 2), (0, 2, 3), (0, 2, 4)]
[(0, 3, 3), (0, 3, 4)]
[(0, 4, 4)]
[(1, 1, 1), (1, 1, 2), (1, 1, 3), (1, 1, 4)]
[(1, 2, 2), (1, 2, 3), (1, 2, 4)]
[(1, 3, 3), (1, 3, 4)]
[(1, 4, 4)]
[(2, 2, 2), (2, 2, 3), (2, 2, 4)]
[(2, 3, 3), (2, 3, 4)]
[(2, 4, 4)]
[(3, 3, 3), (3, 3, 4)]
[(3, 4, 4), (4, 4, 4)]

*我称它们为multiset是为了与您的术语相匹配,但实际上是它们的tuples(有序且不可变的数据结构);使用collections.Counter对象(例如Counter((0, 0, 0, 1))returnsCounter({0: 3, 1: 1}))和递减就像一个真正的multiset方法,但我发现这要慢一些,因为使用顺序实际上很有用。
**其他与我测试的输出相同的较慢的函数:
def f2(n,k):
    prev, group = None, []
    for multiset in itertools.combinations_with_replacement(range(n),k):
        if prev:
            if sum(item1 == item2 for item1, item2 in zip(prev,multiset)) == len(multiset) - 1:
                group.append(multiset)
                continue
        if group:
            yield group
        group = [multiset]
        prev = multiset
    yield group

def f3(n,k):
    prev, group = None, []
    for multiset in itertools.combinations_with_replacement(range(n),k):
        if prev:
            lst = list(prev)
            for item in multiset:
                if item in lst:
                    lst.remove(item)
                else:
                    break
            if len(multiset) - len(lst) == len(multiset) - 1:
                group.append(multiset)
                continue
        if group:
            yield group
        group = [multiset]
        prev = multiset
    yield group

import collections
def f4(n,k):
    prev, group = None, []
    for multiset in itertools.combinations_with_replacement(range(n),k):
        if prev:
            if sum((collections.Counter(prev) - collections.Counter(multiset)).values()) == 1:
                group.append(multiset)
                continue
        if group:
            yield group
        group = [multiset]
        prev = multiset
    yield group

计时示例:
from timeit import timeit
list(f(11,10)) == list(f2(11,10)) == list(f3(11,10)) == list(f4(11,10))
# True
timeit(lambda: list(f(11,10)), number = 10)
# 4.19157001003623
timeit(lambda: list(f2(11,10)), number = 10)
# 7.32002648897469
timeit(lambda: list(f3(11,10)), number = 10)
# 6.236868146806955
timeit(lambda: list(f4(11,10)), number = 10)
# 47.20136355608702

注意,由于生成了大量的组合,对于nk的大值,所有方法都变得缓慢。

09-08 11:50