问题描述
我有一个具有非常简单结构的true/false值数组:
I have an array of true/false values with a very simple structure:
# the real array has hundreds of thousands of items
positions = np.array([True, False, False, False, True, True, True, True, False, False, False], dtype=np.bool)
我想遍历此数组并输出发生更改的位置(true变为false或相反).为此,我提出了两种不同的方法:
I want to traverse this array and output the places where changes happen (true becomes false or the contrary). For this purpose, I've put together two different approaches:
- 递归二进制搜索(查看所有值是否相同,如果不相同,则一分为二,然后递归)
- 纯粹的迭代搜索(遍历所有元素并与上一个/下一个进行比较)
两个版本都能提供我想要的结果,但是Numba对另一个的影响更大.对于一个300k值的虚拟数组,以下是性能结果:
Both versions give exactly the result that I want, however Numba has a greater effect on one than another. With a dummy array of 300k values, here are the performance results:
结果是,使用Numba时,binary_search比iterative_search慢5倍,而从理论上讲,它应该快100倍(如果正确加速,则预计运行时间为9 µs).
As a result, when using Numba, binary_search is 5x slower than iterative_search, while in theory it should be 100x faster (it should be expected to run in 9 µs if it was properly accelerated).
如何使Numba加速二进制搜索和加速迭代搜索?
这两种方法的代码(以及示例position
数组)都可以在以下公开要点上找到: https://gist.github.com/JivanRoquet/d58989aa0a4598e060ec2c705b9f3d8f
Code for both approaches (along with a sample position
array) is available on this public gist: https://gist.github.com/JivanRoquet/d58989aa0a4598e060ec2c705b9f3d8f
注意:Numba不在对象模式下运行binary_search()
,因为在提及nopython=True
时,它不会抱怨并很高兴地编译该函数.
Note: Numba is not running binary_search()
in object mode, because when mentioning nopython=True
, it doesn't complain and happily compiles the function.
推荐答案
您可以使用np.diff
查找值更改的位置,无需运行更复杂的算法或使用numba
:
You can find the positions of value changes by using np.diff
, there is no need to run a more complicated algorithm, or to use numba
:
positions = np.array([True, False, False, False, True, True, True, True, False, False, False], dtype=np.bool)
dpos = np.diff(positions)
# array([ True, False, False, True, False, False, False, True, False, False])
这可行,因为False - True == -1
和np.bool(-1) == True
.
在我的电池供电下(=由于节能模式而节流),以及使用了几年的笔记本电脑,它的性能都很好.
It performs quite well on my battery powered (= throttled due to energy saving mode), and several years old laptop:
In [52]: positions = np.random.randint(0, 2, size=300_000, dtype=bool)
In [53]: %timeit np.diff(positions)
633 µs ± 4.09 µs per loop (mean ± std. dev. of 7 runs, 1000 loops each)
我想在numba
中编写自己的差异会产生相似的性能.
I'd imagine that writing your own diff in numba
should yield similar performance.
最后一条语句是错误的,我使用numba
实现了一个简单的diff函数,它比numpy
快10倍以上(但显然它的功能要少得多,但应该足以完成此任务):
The last statement is false, I implemented a simple diff function using numba
, and it's more than a factor of 10 faster than the numpy
one (but it obviously also has much less features, but should be sufficient for this task):
@numba.njit
def ndiff(x):
s = x.size - 1
r = np.empty(s, dtype=x.dtype)
for i in range(s):
r[i] = x[i+1] - x[i]
return r
In [68]: np.all(ndiff(positions) == np.diff(positions))
Out[68]: True
In [69]: %timeit ndiff(positions)
46 µs ± 138 ns per loop (mean ± std. dev. of 7 runs, 10000 loops each)
这篇关于为什么Numba无法改善此递归功能的文章就介绍到这了,希望我们推荐的答案对大家有所帮助,也希望大家多多支持!