Question
I'm looking to generate the cartesian product of a relatively large number of arrays to span a high-dimensional grid. Because of the high dimensionality, it won't be possible to store the result of the cartesian product computation in memory; rather it will be written to hard disk. Because of this constraint, I need access to the intermediate results as they are generated. What I've been doing so far is this:
for x in range(0, 10):
    for y in range(0, 10):
        for z in range(0, 10):
            writeToHdd(x, y, z)
which, apart from being very nasty, is not scalable (i.e. it would require me writing as many loops as dimensions). I have tried to use the solution proposed here, but that is a recursive solution, which therefore makes it quite hard to obtain the results on the fly as they are being generated. Is there any 'neat' way to do this other than having a hardcoded loop per dimension?
Recommended answer
In plain Python, you can generate the Cartesian product of a collection of iterables using itertools.product:
>>> import itertools
>>> arrays = range(0, 2), range(4, 6), range(8, 10)
>>> list(itertools.product(*arrays))
[(0, 4, 8), (0, 4, 9), (0, 5, 8), (0, 5, 9), (1, 4, 8), (1, 4, 9), (1, 5, 8), (1, 5, 9)]
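The list() call above is only there to display the result. itertools.product returns a lazy iterator, so each tuple can be written out as it is produced, which is what the question asks for. Here is a minimal sketch of that pattern, assuming CSV is an acceptable on-disk format. The function name write_product and the CSV layout are illustrative assumptions standing in for the question's writeToHdd, not part of the original answer.

```python
import csv
import itertools


def write_product(arrays, path):
    """Stream the Cartesian product of `arrays` to a CSV file, one point per row.

    `write_product` and the CSV format are hypothetical stand-ins for the
    question's writeToHdd. Returns the number of rows written.
    """
    count = 0
    with open(path, "w", newline="") as f:
        writer = csv.writer(f)
        # product() yields one tuple at a time, so the full grid is never
        # held in memory, whatever the number of dimensions.
        for point in itertools.product(*arrays):
            writer.writerow(point)
            count += 1
    return count
```

Calling write_product([range(0, 2), range(4, 6), range(8, 10)], "grid.csv") would write eight rows. Note that itertools.product keeps a copy of each input iterable internally, so the inputs themselves must fit in memory even though the product does not.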
In NumPy, you can combine numpy.meshgrid (passing sparse=True to avoid expanding the product in memory) with numpy.ndindex:
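>>> import numpy as np
>>> arrays = np.arange(0, 2), np.arange(4, 6), np.arange(8, 10)
>>> # The sparse grids have shapes (1, 2, 1), (2, 1, 1) and (1, 1, 2);
>>> # broadcast_arrays gives views of the common shape without copying data.
>>> grid = np.broadcast_arrays(*np.meshgrid(*arrays, sparse=True))
>>> [tuple(int(g[i]) for g in grid) for i in np.ndindex(grid[0].shape)]
[(0, 4, 8), (0, 4, 9), (1, 4, 8), (1, 4, 9), (0, 5, 8), (0, 5, 9), (1, 5, 8), (1, 5, 9)]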
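As with the plain-Python version, the list comprehension above builds the whole result only for display. If the points are to be consumed one at a time, a generator over numpy.ndindex may be enough on its own, without meshgrid. The following is a sketch under that assumption; iter_grid is a hypothetical helper name, not a NumPy function.

```python
import numpy as np


def iter_grid(arrays):
    """Lazily yield the points of the Cartesian product of 1-D arrays.

    `iter_grid` is an illustrative helper, not part of NumPy.
    """
    shape = tuple(len(a) for a in arrays)
    # np.ndindex walks every index tuple of `shape` in C order
    # (last axis varies fastest), the same order as itertools.product.
    for index in np.ndindex(*shape):
        yield tuple(a[k] for a, k in zip(arrays, index))
```

Each yielded tuple holds NumPy scalars, so a writer could convert them with int() or float() before serialising. Unlike the meshgrid example, this visits points in the same order as itertools.product.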