I am using Python 3. The function I am using is as follows:
def sub_combinations(segment):
    if len(segment) == 1:
        yield (segment,)
    else:
        for j in sub_combinations(segment[1:]):
            yield ((segment[0],),) + j
            for k in range(len(j)):
                yield ((segment[0],) + j[k],) + j[:k] + j[k+1:]
It is a version of this function:
Yielding sub combinations
The output for (1,2,3,4,5) is as follows:
((1,), (2,), (3,), (4,), (5,))
((1, 2), (3,), (4,), (5,))
((1, 3), (2,), (4,), (5,))
((1, 4), (2,), (3,), (5,)) *
((1, 5), (2,), (3,), (4,)) *
((1,), (2, 3), (4,), (5,))
((1, 2, 3), (4,), (5,))
((1, 4), (2, 3), (5,)) *
((1, 5), (2, 3), (4,)) *
((1,), (2, 4), (3,), (5,))
((1, 2, 4), (3,), (5,))
((1, 3), (2, 4), (5,))
((1, 5), (2, 4), (3,)) *
((1,), (2, 5), (3,), (4,)) *
((1, 2, 5), (3,), (4,)) *
((1, 3), (2, 5), (4,)) *
((1, 4), (2, 5), (3,)) *
((1,), (2,), (3, 4), (5,))
((1, 2), (3, 4), (5,))
((1, 3, 4), (2,), (5,))
((1, 5), (2,), (3, 4)) *
((1,), (2, 3, 4), (5,))
((1, 2, 3, 4), (5,))
((1, 5), (2, 3, 4)) *
((1,), (2, 5), (3, 4)) *
((1, 2, 5), (3, 4)) *
((1, 3, 4), (2, 5)) *
((1,), (2,), (3, 5), (4,))
((1, 2), (3, 5), (4,))
((1, 3, 5), (2,), (4,))
((1, 4), (2,), (3, 5)) *
((1,), (2, 3, 5), (4,))
((1, 2, 3, 5), (4,))
((1, 4), (2, 3, 5)) *
((1,), (2, 4), (3, 5))
((1, 2, 4), (3, 5))
((1, 3, 5), (2, 4))
((1,), (2,), (3,), (4, 5))
((1, 2), (3,), (4, 5))
((1, 3), (2,), (4, 5))
((1, 4, 5), (2,), (3,)) *
((1,), (2, 3), (4, 5))
((1, 2, 3), (4, 5))
((1, 4, 5), (2, 3)) *
((1,), (2, 4, 5), (3,))
((1, 2, 4, 5), (3,))
((1, 3), (2, 4, 5))
((1,), (2,), (3, 4, 5))
((1, 2), (3, 4, 5))
((1, 3, 4, 5), (2,))
((1,), (2, 3, 4, 5))
((1, 2, 3, 4, 5),)
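The problem is that with larger tuples, sub_combinations returns a huge amount of data and takes too long to compute. To solve this, I want to limit the amount of data returned by adding an extra parameter. For example, sub_combinations((1,2,3,4,5), 2) should return the data above, but without the tuples marked with a star. Those rows are dropped because the offset between consecutive values within a tuple is greater than 2. For example, rows containing (1,4), (1,5) or (2,5), as well as rows containing (1,2,5) and so on, are discarded.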
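To pin the rule down, here is a minimal sketch of it as a stand-alone check. The helper name `within_offset` is hypothetical, and the sketch assumes that the values in the tuple are distinct and that "offset" means the distance between positions in the original tuple. Applying it after the fact would not save any time, since every row is still generated; it only shows which rows count as starred.

```python
def within_offset(groups, segment, max_offset):
    """Return True if neighbouring members of every group are at most
    `max_offset` positions apart in `segment` (hypothetical helper)."""
    # Assumption: values are distinct, so each value maps to one position.
    position = {value: i for i, value in enumerate(segment)}
    for group in groups:
        # Compare each member with the next member of the same group.
        for a, b in zip(group, group[1:]):
            if position[b] - position[a] > max_offset:
                return False
    return True


data = (1, 2, 3, 4, 5)
print(within_offset(((1, 4), (2,), (3,), (5,)), data, 2))  # starred row
print(within_offset(((1, 3, 5), (2, 4)), data, 2))         # unstarred row
```

The first call checks a starred row and the second an unstarred one, so they should print False and True respectively.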
The line
for k in range(len(j))
needs to be adjusted to drop these rows, but I have not figured out how to do it. Any suggestions?
Barry
Best answer
I think the following changes produce the output you are looking for:
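def sub_combinations(segment, max_offset=None):
    # Wrap each element in a one-item list; the wrappers are looked up
    # in `data` to get positions.
    data = tuple([e] for e in segment)

    def _sub_combinations(segment):
        if len(segment) == 1:
            yield (segment,)
        else:
            for j in _sub_combinations(segment[1:]):
                yield ((segment[0],),) + j
                for k in range(len(j)):
                    # max_offset=None means no limit. Otherwise stop as soon
                    # as the next group starts too far from segment[0].
                    if max_offset is not None and data.index(j[k][0]) - data.index(segment[0]) > max_offset:
                        break
                    yield ((segment[0],) + j[k],) + j[:k] + j[k+1:]

    # Unwrap the one-item lists back into plain values.
    for combination in _sub_combinations(data):
        yield tuple(tuple(e[0] for e in t) for t in combination)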
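The idea here is to break out of the k loop rather than generate a tuple whose offset is greater than max_offset.
For python - limiting yielded sub combinations, a similar question can be found on Stack Overflow: https://stackoverflow.com/questions/8661959/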
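As a possible variation on the same break-out idea, the sketch below recurses over positions instead of wrapped values, so no `index` lookups are needed and repeated values in the input are not confused with each other. This is not the answer's code: the name `sub_combinations_by_index` and the inner helper `_partitions` are hypothetical, and treating `max_offset=0` as "never merge" is an assumption about the intended meaning.

```python
def sub_combinations_by_index(segment, max_offset=None):
    """Yield groupings of `segment`, skipping any grouping in which two
    neighbouring members of a group are more than `max_offset` positions
    apart (hypothetical variant, not the original answer)."""
    n = len(segment)

    def _partitions(start):
        # Yields groupings of the positions start..n-1 as tuples of
        # position tuples; groups stay ordered by their first position.
        if start == n - 1:
            yield ((start,),)
            return
        for rest in _partitions(start + 1):
            yield ((start,),) + rest
            for k, group in enumerate(rest):
                # Because groups are ordered by first position, once one
                # group is too far away, every later group is too.
                if max_offset is not None and group[0] - start > max_offset:
                    break
                yield ((start,) + group,) + rest[:k] + rest[k + 1:]

    if n == 0:
        return
    for combination in _partitions(0):
        # Map positions back to the original values.
        yield tuple(tuple(segment[i] for i in group) for group in combination)


for row in sub_combinations_by_index((1, 2, 3, 4, 5), 2):
    print(row)
```

With `max_offset=2` the loop should print only the unstarred rows from the question, in the same order as the unrestricted output.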