Question
There are several elegant examples of using numpy in Python to generate arrays of all combinations. For example, the answer here: Using numpy to build an array of all combinations of two arrays.
Now suppose there is an extra constraint, namely that the sum of all numbers cannot exceed a given constant K. Using a generator and itertools.product, for an example with K=3 where we want the combinations of three variables with ranges 0-1, 0-3, and 0-2, we can do it as follows:
from itertools import product
import numpy as np

K = 3
maxRange = np.array([1, 3, 2])
states = np.array([i for i in product(*(range(i+1) for i in maxRange)) if sum(i) <= K])
which returns
array([[0, 0, 0],
[0, 0, 1],
[0, 0, 2],
[0, 1, 0],
[0, 1, 1],
[0, 1, 2],
[0, 2, 0],
[0, 2, 1],
[0, 3, 0],
[1, 0, 0],
[1, 0, 1],
[1, 0, 2],
[1, 1, 0],
[1, 1, 1],
[1, 2, 0]])
In principle, the approach from https://stackoverflow.com/a/25655090/1479342 can be used to generate all possible combinations without the constraint and then to select the subset of combinations whose sum is at most K. However, that approach generates many more combinations than necessary, especially if K is small compared to sum(maxRange).
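To illustrate that overhead on the question's own example (a quick check I added, not part of the original post): the unconstrained product already contains 24 states, of which only 15 satisfy the constraint, and the gap widens rapidly as the number of variables grows.

```python
import numpy as np
from itertools import product

K = 3
max_range = np.array([1, 3, 2])

# Size of the unconstrained Cartesian product: (1+1)*(3+1)*(2+1) = 24
total = int(np.prod(max_range + 1))

# Number of states that actually satisfy sum <= K
kept = sum(1 for c in product(*(range(m + 1) for m in max_range)) if sum(c) <= K)

print(total, kept)  # 24 15
```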
There must be a way to do this faster and with lower memory usage. How can this be achieved using a vectorized approach (for example, using np.indices)?
Answer
Edited. For completeness, I'm adding the OP's code here:
import itertools
import numpy as np

def partition0(max_range, S):
    K = len(max_range)
    return np.array([i for i in itertools.product(*(range(i+1) for i in max_range)) if sum(i) <= S])
The first approach is pure np.indices. It's fast for small inputs but consumes a lot of memory (the OP already pointed out that this is not what he meant).
def partition1(max_range, S):
    max_range = np.asarray(max_range, dtype=int)
    a = np.indices(max_range + 1)
    b = a.sum(axis=0) <= S
    return (a[:, b].T)
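As a sanity check (my own addition, not part of the original answer), partition1 can be compared against the brute-force itertools filter on the small example from the question; both should produce the same 15 states, up to row order.

```python
import numpy as np
from itertools import product

def partition1(max_range, S):
    max_range = np.asarray(max_range, dtype=int)
    a = np.indices(max_range + 1)       # shape (3, 2, 4, 3) for max_range=[1,3,2]
    b = a.sum(axis=0) <= S              # boolean mask over the grid
    return a[:, b].T                    # one row per admissible state

states = partition1([1, 3, 2], 3)
brute = {c for c in product(range(2), range(4), range(3)) if sum(c) <= 3}
print(states.shape)  # (15, 3)
```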
A recursive approach turns out to be much better than the ones above:
def partition2(max_range, max_sum):
    max_range = np.asarray(max_range, dtype=int).ravel()
    if max_range.size == 1:
        return np.arange(min(max_range[0], max_sum) + 1, dtype=int).reshape(-1, 1)
    P = partition2(max_range[1:], max_sum)
    # S[i] is the largest summand we can place in front of P[i]
    S = np.minimum(max_sum - P.sum(axis=1), max_range[0])
    offset, sz = 0, S.size
    out = np.empty(shape=(sz + S.sum(), P.shape[1] + 1), dtype=int)
    out[:sz, 0] = 0
    out[:sz, 1:] = P
    for i in range(1, max_range[0] + 1):
        ind, = np.nonzero(S)
        offset, sz = offset + sz, ind.size
        out[offset:offset + sz, 0] = i
        out[offset:offset + sz, 1:] = P[ind]
        S[ind] -= 1
    return out
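A quick way to convince yourself that the recursion is correct (again my own check, not from the original answer) is to compare its output against the brute-force filter on the question's example:

```python
import numpy as np
from itertools import product

def partition2(max_range, max_sum):
    max_range = np.asarray(max_range, dtype=int).ravel()
    if max_range.size == 1:
        return np.arange(min(max_range[0], max_sum) + 1, dtype=int).reshape(-1, 1)
    P = partition2(max_range[1:], max_sum)
    # S[i] is the largest summand we can place in front of P[i]
    S = np.minimum(max_sum - P.sum(axis=1), max_range[0])
    offset, sz = 0, S.size
    out = np.empty(shape=(sz + S.sum(), P.shape[1] + 1), dtype=int)
    out[:sz, 0] = 0
    out[:sz, 1:] = P
    for i in range(1, max_range[0] + 1):
        ind, = np.nonzero(S)
        offset, sz = offset + sz, ind.size
        out[offset:offset + sz, 0] = i
        out[offset:offset + sz, 1:] = P[ind]
        S[ind] -= 1
    return out

states = partition2([1, 3, 2], 3)
brute = {c for c in product(range(2), range(4), range(3)) if sum(c) <= 3}
print(states.shape)  # (15, 3)
```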
After a short thought, I was able to take it a bit further. If we know the number of possible partitions in advance, we can allocate enough memory at once. (It's somewhat similar to cartesian in the already linked thread.)
First, we need a function that counts partitions.
def number_of_partitions(max_range, max_sum):
    '''
    Returns an array arr of the same shape as max_range, where
    arr[j] = number of admissible partitions for
             j summands bounded by max_range[j:] and with sum <= max_sum
    '''
    M = max_sum + 1
    N = len(max_range)
    arr = np.zeros(shape=(M, N), dtype=int)
    arr[:, -1] = np.where(np.arange(M) <= min(max_range[-1], max_sum), 1, 0)
    for i in range(N - 2, -1, -1):
        for j in range(max_range[i] + 1):
            arr[j:, i] += arr[:M - j, i + 1]
    return arr.sum(axis=0)
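On the question's example, this function should report 15 admissible partitions for all three summands, 9 for the last two, and 3 for the last one alone (a worked check I added; the numbers for the tails follow by counting pairs and singletons by hand):

```python
import numpy as np

def number_of_partitions(max_range, max_sum):
    # Dynamic-programming count of partitions bounded by max_range[j:]
    # with sum <= max_sum, as in the answer above.
    M = max_sum + 1
    N = len(max_range)
    arr = np.zeros(shape=(M, N), dtype=int)
    arr[:, -1] = np.where(np.arange(M) <= min(max_range[-1], max_sum), 1, 0)
    for i in range(N - 2, -1, -1):
        for j in range(max_range[i] + 1):
            arr[j:, i] += arr[:M - j, i + 1]
    return arr.sum(axis=0)

counts = number_of_partitions(np.array([1, 3, 2]), 3)
print(counts)  # [15  9  3]
```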
The main function:
def partition3(max_range, max_sum, out=None, n_part=None):
    if out is None:
        max_range = np.asarray(max_range, dtype=int).ravel()
        n_part = number_of_partitions(max_range, max_sum)
        out = np.zeros(shape=(n_part[0], max_range.size), dtype=int)
    if max_range.size == 1:
        out[:] = np.arange(min(max_range[0], max_sum) + 1, dtype=int).reshape(-1, 1)
        return out
    P = partition3(max_range[1:], max_sum, out=out[:n_part[1], 1:], n_part=n_part[1:])
    # P is now a useful reference
    S = np.minimum(max_sum - P.sum(axis=1), max_range[0])
    offset, sz = 0, S.size
    out[:sz, 0] = 0
    for i in range(1, max_range[0] + 1):
        ind, = np.nonzero(S)
        offset, sz = offset + sz, ind.size
        out[offset:offset + sz, 0] = i
        out[offset:offset + sz, 1:] = P[ind]
        S[ind] -= 1
    return out
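Since partition3 fills a preallocated buffer through views, it is worth verifying (my own check, not part of the original answer) that it still reproduces exactly the brute-force result on the question's example:

```python
import numpy as np
from itertools import product

def number_of_partitions(max_range, max_sum):
    # Counts admissible partitions, as defined in the answer above.
    M = max_sum + 1
    N = len(max_range)
    arr = np.zeros(shape=(M, N), dtype=int)
    arr[:, -1] = np.where(np.arange(M) <= min(max_range[-1], max_sum), 1, 0)
    for i in range(N - 2, -1, -1):
        for j in range(max_range[i] + 1):
            arr[j:, i] += arr[:M - j, i + 1]
    return arr.sum(axis=0)

def partition3(max_range, max_sum, out=None, n_part=None):
    if out is None:
        max_range = np.asarray(max_range, dtype=int).ravel()
        n_part = number_of_partitions(max_range, max_sum)
        out = np.zeros(shape=(n_part[0], max_range.size), dtype=int)
    if max_range.size == 1:
        out[:] = np.arange(min(max_range[0], max_sum) + 1, dtype=int).reshape(-1, 1)
        return out
    # Fill the tail columns in place; P is a view into out
    P = partition3(max_range[1:], max_sum, out=out[:n_part[1], 1:], n_part=n_part[1:])
    S = np.minimum(max_sum - P.sum(axis=1), max_range[0])
    offset, sz = 0, S.size
    out[:sz, 0] = 0
    for i in range(1, max_range[0] + 1):
        ind, = np.nonzero(S)
        offset, sz = offset + sz, ind.size
        out[offset:offset + sz, 0] = i
        out[offset:offset + sz, 1:] = P[ind]
        S[ind] -= 1
    return out

states = partition3([1, 3, 2], 3)
brute = {c for c in product(range(2), range(4), range(3)) if sum(c) <= 3}
print(states.shape)  # (15, 3)
```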
Some tests:
max_range = [3, 4, 6, 3, 4, 6, 3, 4, 6]
for f in [partition0, partition1, partition2, partition3]:
    print(f.__name__ + ':')
    for max_sum in [5, 15, 25]:
        print('Sum %2d: ' % max_sum, end='')
        %timeit f(max_range, max_sum)
    print()
partition0:
Sum 5: 1 loops, best of 3: 859 ms per loop
Sum 15: 1 loops, best of 3: 1.39 s per loop
Sum 25: 1 loops, best of 3: 3.18 s per loop
partition1:
Sum 5: 10 loops, best of 3: 176 ms per loop
Sum 15: 1 loops, best of 3: 224 ms per loop
Sum 25: 1 loops, best of 3: 403 ms per loop
partition2:
Sum 5: 1000 loops, best of 3: 809 µs per loop
Sum 15: 10 loops, best of 3: 62.5 ms per loop
Sum 25: 1 loops, best of 3: 262 ms per loop
partition3:
Sum 5: 1000 loops, best of 3: 853 µs per loop
Sum 15: 10 loops, best of 3: 59.1 ms per loop
Sum 25: 1 loops, best of 3: 249 ms per loop
And something bigger:
%timeit partition0([3,6] * 5, 20)
1 loops, best of 3: 11.9 s per loop
%timeit partition1([3,6] * 5, 20)
The slowest run took 12.68 times longer than the fastest. This could mean that an intermediate result is being cached
1 loops, best of 3: 2.33 s per loop
# MemoryError in another test
%timeit partition2([3,6] * 5, 20)
1 loops, best of 3: 877 ms per loop
%timeit partition3([3,6] * 5, 20)
1 loops, best of 3: 739 ms per loop