Problem description
I'm trying to create a numpy array consisting of all possible asset allocations using itertools.product. The conditions are that the allocation for each asset can range from 0 to 100% and can rise in increments of (100% / number of assets). The allocations must sum to 100%.
The calculation takes a very long time as the number of assets grows (10 seconds for 7 assets, 210 seconds for 8 assets, and so on). Is there a way to speed up the code? Maybe I should try using it.takewhile or multiprocessing?
import itertools as it
import numpy as np


def CreateMatrix(Increments):
    # Enumerate every combination of allocation levels, one level per asset.
    inputs = it.product(np.arange(0, 1 + Increments, Increments), repeat=int(1/Increments))
    matrix = np.ndarray((1, int(1/Increments)))
    x = 0
    for i in inputs:
        # Keep only the combinations that add up to 100%.
        if np.sum(i, axis=0) == 1:
            if x > 0:
                # Grow the result by one (uninitialized) row.
                matrix = np.r_[matrix, np.ndarray((1, int(1/Increments)))]
            matrix[x] = i
            x = x + 1
    return matrix


Assets = ['a', 'b', 'c', 'd', 'e', 'f', 'g', 'h']
Increments = 1.0 / len(Assets)
matrix = CreateMatrix(Increments)
print(matrix)
Recommended answer
Use the stdlib sum instead of numpy.sum. According to cProfile, this code spends most of its time computing that sum.
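To see the per-call difference in isolation, here is a minimal sketch (written for Python 3, not part of the original answer) that times both functions on a single candidate row of the kind the loop in CreateMatrix processes. The names `levels`, `row`, and `calls` are made up for illustration. A likely reason for the gap is that numpy.sum has to convert the tuple into an array and pass through wrapper functions on every call, which is consistent with the `_wrapreduction` entries in the profile.

```python
import timeit

import numpy as np

# One candidate row as the loop sees it: a tuple of 8 NumPy float scalars.
# (Assumption: 8 assets, so each level is a multiple of 0.125.)
levels = np.arange(0, 1 + 0.125, 0.125)
row = tuple(levels[1] for _ in range(8))  # eight allocations of 12.5%

# Both functions agree on the total for this row.
builtin_total = sum(row)
numpy_total = np.sum(row, axis=0)

# Time each call on its own; absolute numbers depend on the machine.
calls = 20000
builtin_time = timeit.timeit(lambda: sum(row), number=calls)
numpy_time = timeit.timeit(lambda: np.sum(row, axis=0), number=calls)

print("builtin sum: %.3f us per call" % (builtin_time / calls * 1e6))
print("numpy.sum:   %.3f us per call" % (numpy_time / calls * 1e6))
```

Multiplied by the tens of millions of rows that the brute-force loop visits, even a few microseconds of per-call overhead adds up to minutes.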
Profiling code
import cProfile, pstats, StringIO
import itertools as it
import numpy as np


# Python 2 script (it relies on the StringIO module).
def CreateMatrix(Increments):
    inputs = it.product(np.arange(0, 1 + Increments, Increments), repeat=int(1/Increments))
    matrix = np.ndarray((1, int(1/Increments)))
    x = 0
    for i in inputs:
        if np.sum(i, axis=0) == 1:
            if x > 0:
                matrix = np.r_[matrix, np.ndarray((1, int(1/Increments)))]
            matrix[x] = i
            x += 1
    return matrix


# Profile the whole run, including the final print.
pr = cProfile.Profile()
pr.enable()
Assets = ['a', 'b', 'c', 'd', 'e', 'f', 'g', 'h']
Increments = 1.0 / len(Assets)
matrix = CreateMatrix(Increments)
print(matrix)
pr.disable()

# Dump the collected statistics, sorted by cumulative time.
s = StringIO.StringIO()
sortby = 'cumulative'
ps = pstats.Stats(pr, stream=s).sort_stats(sortby)
ps.print_stats()
print(s.getvalue())
Truncated output
         301565912 function calls (301565864 primitive calls) in 294.255 seconds

   Ordered by: cumulative time

   ncalls  tottime  percall  cumtime  percall filename:lineno(function)
        1   26.294   26.294  294.254  294.254 product.py:7(CreateMatrix)
 43046721   41.948    0.000  267.762    0.000 Library/Python/2.7/lib/python/site-packages/numpy/core/fromnumeric.py:1966(sum)
 43046723   60.071    0.000  217.863    0.000 Library/Python/2.7/lib/python/site-packages/numpy/core/fromnumeric.py:69(_wrapreduction)
 43046723  124.341    0.000  124.341    0.000 {method 'reduce' of 'numpy.ufunc' objects}
 43046723   14.630    0.000   14.630    0.000 Library/Python/2.7/lib/python/site-packages/numpy/core/fromnumeric.py:70(<dictcomp>)
 43046721   12.629    0.000   12.629    0.000 {getattr}
 43098200    7.958    0.000    7.958    0.000 {isinstance}
 43046724    6.191    0.000    6.191    0.000 {method 'items' of 'dict' objects}
     6434    0.047    0.000    0.199    0.000 Library/Python/2.7/lib/python/site-packages/numpy/lib/index_tricks.py:316(__getitem__)
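The profiling script above is Python 2: it imports the StringIO module, which no longer exists in Python 3. As a rough Python 3 equivalent, the sketch below uses io.StringIO and the same cProfile/pstats calls. The function `create_matrix` is a shortened stand-in for CreateMatrix, and the run uses 4 assets instead of 8 so that it finishes quickly; both choices are assumptions made for the example, not part of the original answer.

```python
import cProfile
import io
import itertools as it
import pstats

import numpy as np


def create_matrix(increments):
    # Same brute-force filter as CreateMatrix, shortened for this sketch.
    n = int(round(1 / increments))
    levels = np.arange(0, 1 + increments, increments)
    rows = [row for row in it.product(levels, repeat=n)
            if np.sum(row, axis=0) == 1]
    return np.array(rows)


profiler = cProfile.Profile()
profiler.enable()
matrix = create_matrix(1.0 / 4)  # 4 assets keeps the run short
profiler.disable()

# Collect the report in memory, sorted by cumulative time.
stream = io.StringIO()
stats = pstats.Stats(profiler, stream=stream).sort_stats("cumulative")
stats.print_stats(10)  # limit the report to the first 10 entries
report = stream.getvalue()
print(report)
```

Even at this small size, the numpy.sum call and its internals should show up near the top of the report, as they do in the 8-asset output above.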
Timing experiments
numpy.sum
import itertools as it
import numpy as np


def CreateMatrix(Increments):
    inputs = it.product(np.arange(0, 1 + Increments, Increments), repeat=int(1/Increments))
    matrix = np.ndarray((1, int(1/Increments)))
    x = 0
    for i in inputs:
        # numpy.sum version: this call dominates the run time.
        if np.sum(i, axis=0) == 1:
            if x > 0:
                matrix = np.r_[matrix, np.ndarray((1, int(1/Increments)))]
            matrix[x] = i
            x = x + 1
    return matrix


Assets = ['a', 'b', 'c', 'd', 'e', 'f', 'g', 'h']
Increments = 1.0 / len(Assets)
matrix = CreateMatrix(Increments)
$ python -m timeit --number=3 --verbose "$(cat product.py)"
raw times: 738 696 697
3 loops, best of 3: 232 sec per loop
Stdlib sum
import itertools as it
import numpy as np


def CreateMatrix(Increments):
    inputs = it.product(np.arange(0, 1 + Increments, Increments), repeat=int(1/Increments))
    matrix = np.ndarray((1, int(1/Increments)))
    x = 0
    for i in inputs:
        # Built-in sum version: the only change from the numpy.sum script.
        if sum(i) == 1:
            if x > 0:
                matrix = np.r_[matrix, np.ndarray((1, int(1/Increments)))]
            matrix[x] = i
            x = x + 1
    return matrix


Assets = ['a', 'b', 'c', 'd', 'e', 'f', 'g', 'h']
Increments = 1.0 / len(Assets)
matrix = CreateMatrix(Increments)
$ python -m timeit --number=3 --verbose "$(cat product.py)"
raw times: 90.5 84.3 85.3
3 loops, best of 3: 28.1 sec per loop
As other folks have said in their comments, there are many more ways to get your solution faster. Take a look at "How do I "multi-process" the itertools product module?" for an idea of how to use multiprocessing to speed this up. Whatever you do (a clever algorithm, concurrency, or both), replace the sum function; in the timings above it cuts the run from 232 to 28.1 seconds per loop for very little effort.
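The answer does not say which "clever algorithm" it has in mind. One possibility, sketched here purely as an illustration in Python 3, is to generate only the valid rows instead of filtering the full product. With 8 assets the brute-force loop visits 9^8 = 43,046,721 tuples (the call count in the profile) to find 6,435 valid rows. The "stars and bars" construction below enumerates those rows directly with itertools.combinations. The function name `allocations` is hypothetical, and the rows come out in a different order than in the original code.

```python
import itertools as it

import numpy as np


def allocations(n_assets):
    """Return every split of n_assets equal increments among n_assets assets."""
    units = n_assets  # each unit is worth 1 / n_assets of the portfolio
    slots = units + n_assets - 1
    rows = []
    # Stars and bars: pick positions for n_assets - 1 dividers among the
    # slots; the gaps between consecutive dividers are the unit counts.
    for bars in it.combinations(range(slots), n_assets - 1):
        edges = (-1,) + bars + (slots,)
        counts = [edges[k + 1] - edges[k] - 1 for k in range(n_assets)]
        rows.append(counts)
    # Build the array once, then convert integer units to fractions.
    return np.array(rows, dtype=float) / units


matrix = allocations(8)
print(matrix.shape)  # (6435, 8)
```

Working in integer units until the final division also avoids comparing a floating-point sum to 1 with `==`, and collecting rows in a list avoids growing the array with np.r_ one row at a time.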