NumPy replacement for Python's itertools.product

Question

I am working with a list of lists whose size varies. For example, alternativesList may contain 4 lists in one iteration and 7 lists in another.

What I am trying to do is capture every combination of words across the different lists.

Say we have:

import itertools

alternativesList = []
a = [1, 2, 3]
alternativesList.append(a)
b = ["a", "b", "c"]
alternativesList.append(b)

productList = itertools.product(*alternativesList)

Iterating over productList yields:

[(1, 'a'), (1, 'b'), (1, 'c'), (2, 'a'), (2, 'b'), (2, 'c'), (3, 'a'), (3, 'b'), (3, 'c')]

One problem is that productList can be large enough to cause memory problems, so I keep productList as an iterator object and iterate over it later.
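
The lazy behaviour described above can be sketched as follows. This is a minimal illustration, not code from the original post: the names alternatives_list, product_iter, first_four, and remaining are made up for the example, and itertools.islice is used only to show that combinations are produced on demand.

```python
import itertools

# Illustrative input (hypothetical name, same data as in the question).
alternatives_list = [[1, 2, 3], ["a", "b", "c"]]

# product() returns a lazy iterator; output tuples are built only when requested.
product_iter = itertools.product(*alternatives_list)

# Pull just the first four combinations.
first_four = list(itertools.islice(product_iter, 4))
print(first_four)

# The iterator remembers its position, so only the other five combinations remain.
remaining = list(product_iter)
print(len(remaining))
```

Because only one tuple is alive at a time while looping, memory use stays small even when the full product would not fit in a list.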

What I want to know is whether there is a way to create the same object with NumPy that works faster than itertools.

Recommended answer

You can avoid some of the problems that arise from NumPy trying to find a catch-all dtype by explicitly specifying a compound (structured) dtype.
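
To make the catch-all dtype issue concrete, here is a small sketch. The variable names (mixed, record_dtype, records) and the choice of np.int64 and 'U1' for the fields are assumptions made for the illustration; they do not come from the original answer.

```python
import numpy as np

# A plain array needs one common dtype, so the integer is converted to a string.
mixed = np.array([1, "a"])
print(mixed.dtype.kind)      # 'U' means a unicode string dtype

# A structured dtype gives every field its own type (field names are illustrative).
record_dtype = np.dtype([("f0", np.int64), ("f1", "U1")])
records = np.array([(1, "a"), (2, "b")], dtype=record_dtype)

# The integer column stays an integer column.
print(records["f0"].dtype)
print(records["f1"].dtype)
```

With the structured dtype, each field can be read back as a homogeneous array, which is what the function in the answer relies on.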

Code + some timings:

import numpy as np
import itertools

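def cartesian_product_mixed_type(*arrays):
    # Convert every input to an ndarray so each one has a dtype and a length.
    arrays = *map(np.asanyarray, arrays),
    # Structured dtype with one field per input (f0, f1, ...); no common dtype is needed.
    dtype = np.dtype([(f'f{i}', a.dtype) for i, a in enumerate(arrays)])
    # One axis per input: shape is (len(arrays[0]), len(arrays[1]), ...).
    out = np.empty((*map(len, arrays),), dtype)
    # Index template (slice(None), None, None, ...) that appends trailing axes.
    idx = slice(None), *itertools.repeat(None, len(arrays) - 1)
    for i, a in enumerate(arrays):
        # Input i gets len(arrays) - 1 - i trailing axes, so it broadcasts along axis i.
        out[f'f{i}'] = a[idx[:len(arrays) - i]]
    # Flatten in C order, which matches the ordering of itertools.product.
    return out.ravel()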

a = np.arange(4)
b = np.arange(*map(ord, ('A', 'D')), dtype=np.int32).view('U1')
c = np.arange(2.)

np.set_printoptions(threshold=10)

print(f'a={a}')
print(f'b={b}')
print(f'c={c}')

print('itertools')
print(list(itertools.product(a,b,c)))
print('numpy')
print(cartesian_product_mixed_type(a,b,c))

a = np.arange(100)
b = np.arange(*map(ord, ('A', 'z')), dtype=np.int32).view('U1')
c = np.arange(20.)

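import timeit
# With number=1000, timeit returns the total seconds for 1000 runs,
# which is numerically the same as milliseconds per run.
kwds = dict(globals=globals(), number=1000)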

print()
print(f'a={a}')
print(f'b={b}')
print(f'c={c}')

print(f"itertools: {timeit.timeit('list(itertools.product(a,b,c))', **kwds):7.4f} ms")
print(f"numpy:     {timeit.timeit('cartesian_product_mixed_type(a,b,c)', **kwds):7.4f} ms")

a = np.arange(1000)
b = np.arange(1000, dtype=np.int32).view('U1')

print()
print(f'a={a}')
print(f'b={b}')

print(f"itertools: {timeit.timeit('list(itertools.product(a,b))', **kwds):7.4f} ms")
print(f"numpy:     {timeit.timeit('cartesian_product_mixed_type(a,b)', **kwds):7.4f} ms")

Sample output:

a=[0 1 2 3]
b=['A' 'B' 'C']
c=[0. 1.]
itertools
[(0, 'A', 0.0), (0, 'A', 1.0), (0, 'B', 0.0), (0, 'B', 1.0), (0, 'C', 0.0), (0, 'C', 1.0), (1, 'A', 0.0), (1, 'A', 1.0), (1, 'B', 0.0), (1, 'B', 1.0), (1, 'C', 0.0), (1, 'C', 1.0), (2, 'A', 0.0), (2, 'A', 1.0), (2, 'B', 0.0), (2, 'B', 1.0), (2, 'C', 0.0), (2, 'C', 1.0), (3, 'A', 0.0), (3, 'A', 1.0), (3, 'B', 0.0), (3, 'B', 1.0), (3, 'C', 0.0), (3, 'C', 1.0)]
numpy
[(0, 'A', 0.) (0, 'A', 1.) (0, 'B', 0.) ... (3, 'B', 1.) (3, 'C', 0.)
 (3, 'C', 1.)]

a=[ 0  1  2 ... 97 98 99]
b=['A' 'B' 'C' ... 'w' 'x' 'y']
c=[ 0.  1.  2. ... 17. 18. 19.]
itertools:  7.4339 ms
numpy:      1.5701 ms

a=[  0   1   2 ... 997 998 999]
b=['' '\x01' '\x02' ... 'ϥ' 'Ϧ' 'ϧ']
itertools: 62.6357 ms
numpy:      8.0249 ms
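
As a rough usage sketch, the result can be read column by column through its field names, or converted back to Python tuples. The function is repeated from the answer so the block runs on its own; the inputs numbers and letters and the other variable names are illustrative and not part of the original post.

```python
import itertools
import numpy as np

def cartesian_product_mixed_type(*arrays):
    # Same function as in the answer, repeated so this block is self-contained.
    arrays = *map(np.asanyarray, arrays),
    dtype = np.dtype([(f'f{i}', a.dtype) for i, a in enumerate(arrays)])
    out = np.empty((*map(len, arrays),), dtype)
    idx = slice(None), *itertools.repeat(None, len(arrays) - 1)
    for i, a in enumerate(arrays):
        out[f'f{i}'] = a[idx[:len(arrays) - i]]
    return out.ravel()

# Illustrative inputs (hypothetical names).
numbers = np.arange(3)
letters = np.array(['x', 'y'])

result = cartesian_product_mixed_type(numbers, letters)

# Each field is one column of the product.
first_column = result['f0']
second_column = result['f1']

# tolist() turns the records into Python tuples, in the same order as itertools.product.
as_tuples = result.tolist()
expected = list(itertools.product(numbers.tolist(), letters.tolist()))
print(as_tuples == expected)
```

Note that the array holds every combination at once, so this approach trades the lazy iteration of itertools.product for speed.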

09-05 10:32