Why is np.compress faster than boolean indexing?

Question

What is np.compress doing internally that makes it faster than boolean indexing?

In this example, compress is ~20% faster; the time savings depends on the size of a and on the number of True values in the boolean array b, but on my machine compress is always faster.

import numpy as np

a = np.random.rand(1000000,4)
b = (a[:,0]>0.5)

%timeit a[b]
#>>> 10 loops, best of 3: 24.7 ms per loop
%timeit a.compress(b, axis=0)
#>>> 10 loops, best of 3: 20 ms per loop
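
The gap is easy to probe. Here is a minimal sketch (not from the original post) that varies the fraction of True values in the mask to compare the two approaches:

import numpy as np
from timeit import timeit

a = np.random.rand(1000000, 4)
for frac in (0.1, 0.5, 0.9):
    # mask with roughly `frac` of its entries True
    b = a[:, 0] < frac
    t_idx = timeit(lambda: a[b], number=10)
    t_comp = timeit(lambda: a.compress(b, axis=0), number=10)
    print("frac=%.1f  a[b]: %.3fs  compress: %.3fs" % (frac, t_idx, t_comp))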

Compare the documentation for boolean indexing with the documentation for compress.

However, using the method provided here for determining whether two arrays share the same data buffer shows that neither method shares data with its parent a, which I take to mean neither method returns an actual slice.

def get_data_base(arr):
    base = arr
    while isinstance(base.base, np.ndarray):
        base = base.base
    return base

def arrays_share_data(x, y):
    return get_data_base(x) is get_data_base(y)

arrays_share_data(a, a.compress(b, axis=0))
#>>> False
arrays_share_data(a, a[b])
#>>> False
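
For what it's worth, a NumPy recent enough to have np.shares_memory (1.11+) can reach the same conclusion without the helper functions; a quick check along those lines:

import numpy as np

a = np.random.rand(1000000, 4)
b = a[:, 0] > 0.5

# both fancy indexing and compress return copies, not views
print(np.shares_memory(a, a[b]))                   # False
print(np.shares_memory(a, a.compress(b, axis=0)))  # False
print(np.shares_memory(a, a[:, :2]))               # True: a basic slice is a view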

I am simply curious because I perform these operations frequently in my work. I run Python 3.5.2 and NumPy 1.11.1, installed via Anaconda.

Answer

The source for compress, as found on the numpy github:

/numpy/core/src/multiarray/item_selection.c
PyArray_Compress(PyArrayObject *self, PyObject *condition, int axis,
             PyArrayObject *out)
    /* various checks */
    res = PyArray_Nonzero(cond);
    ret = PyArray_TakeFrom(self, PyTuple_GET_ITEM(res, 0), axis,
                       out, NPY_RAISE);
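
In Python terms, a rough (hypothetical) rendering of what that C routine does might look like this:

import numpy as np

def compress_sketch(arr, condition, axis=0):
    # rough Python equivalent of PyArray_Compress: flatten the condition,
    # find the indices of the True (nonzero) entries, then gather with take
    cond = np.asarray(condition).ravel()
    idx = np.nonzero(cond)[0]
    return arr.take(idx, axis=axis)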

With your sample arrays, compress is the same as doing where to get an index array and then take:

In [135]: a.shape
Out[135]: (1000000, 4)
In [136]: b.shape
Out[136]: (1000000,)
In [137]: a.compress(b, axis=0).shape
Out[137]: (499780, 4)
In [138]: a.take(np.nonzero(b)[0], axis=0).shape
Out[138]: (499780, 4)
In [139]: timeit a.compress(b, axis=0).shape
100 loops, best of 3: 14.3 ms per loop
In [140]: timeit a.take(np.nonzero(b)[0], axis=0).shape
100 loops, best of 3: 14.3 ms per loop
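
As a sanity check (not part of the original answer), the different spellings select exactly the same rows, so the timing comparison really is apples to apples:

import numpy as np

a = np.random.rand(1000000, 4)
b = a[:, 0] > 0.5

# all three spellings produce identical arrays
print(np.array_equal(a[b], a.compress(b, axis=0)))             # True
print(np.array_equal(a[b], a.take(np.nonzero(b)[0], axis=0)))  # True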

In fact, if I use this index array with [] indexing, I get comparable times:

In [141]: idx=np.where(b)[0]
In [142]: idx.shape
Out[142]: (499780,)
In [143]: timeit a[idx,:].shape
100 loops, best of 3: 14.6 ms per loop
In [144]: timeit np.take(a,idx, axis=0).shape
100 loops, best of 3: 9.9 ms per loop

The np.take code is more involved, since it includes the clip and wrap modes.
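
Those modes control what happens to out-of-bounds indices; a small illustration (not from the original answer):

import numpy as np

x = np.arange(5) * 10          # [ 0 10 20 30 40]
idx = np.array([0, 3, 7])      # 7 is out of bounds for length 5

print(np.take(x, idx, mode='clip'))   # [ 0 30 40]  index 7 clipped to 4
print(np.take(x, idx, mode='wrap'))   # [ 0 30 20]  7 % 5 == 2
# mode='raise' (the default) would raise an IndexError for index 7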

[] indexing gets translated into a __getitem__ call and goes through various layers. I haven't traced that code very far, but I think it's safe to say that compress (or rather take) simply takes a more direct route to the task, and thus gets a modest speed increase. A speed difference of 30-50% suggests differences in compiled-code details, not something major like views vs. copies, or interpreted vs. compiled.
