Number of elements of numpy arrays inside specific bins


Problem description

I have an ensemble of sorted (one-dimensional) arrays of unequal lengths (say M0, M1 and M2). I want to find how many elements of each of these arrays are inside specific number ranges (where the number ranges are specified by neighboring elements of another sorted array, say zbin). I want to know the fastest way to achieve this.

Here, I am giving a small example of the task that I want to do (and also the method that I am following presently to achieve the desired functionality):

import numpy as np

""" Function to do search query """
def search(numrange, lst):
    # One count per array in lst
    arr = np.zeros(len(lst))
    for i, probe in enumerate(lst):
        count = 0
        for value in probe:
            # probe is sorted, so nothing past the upper edge can match
            if value > numrange[1]:
                break
            # Both edges of the range are inclusive
            if numrange[0] <= value <= numrange[1]:
                count += 1
        arr[i] = count
    return arr


""" Some example of sorted one-dimensional arrays of unequal lengths """
M0 = np.array([5.1, 5.4, 6.4, 6.8, 7.9])
M1 = np.array([5.2, 5.7, 8.8, 8.9, 9.1, 9.2])
M2 = np.array([6.1, 6.2, 6.5, 7.2])

""" Implementation and output """
lst = [M0, M1, M2]
zbin = np.array([5.0, 5.5, 6.0, 6.5])
zarr = np.zeros( (len(zbin)-1, len(lst)) )
for i in range(len(zbin)-1):
    numrange = [zbin[i], zbin[i+1]]
    zarr[i,:] = search(numrange, lst)

print(zarr)

Output:

[[ 2.  1.  0.]
 [ 0.  1.  0.]
 [ 1.  0.  3.]]

Here, the final output zarr gives me the number of elements of each of the arrays (M0, M1 and M2) inside each of the bins defined by zbin (viz. [5.0, 5.5], [5.5, 6.0] and [6.0, 6.5]). For example, consider the bin [5.0, 5.5]. The array M0 has 2 elements inside that bin (5.1 and 5.4), M1 has 1 element (5.2) and M2 has 0 elements in that bin. This gives the first row of zarr, i.e. [2,1,0]. The other rows of zarr are obtained in the same manner.
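
As a quick cross-check of the worked example for the bin [5.0, 5.5], the counts can be recomputed with a plain boolean mask. This is only an illustrative sketch: the names lo, hi and first_row are introduced here and are not part of the original post, and both bin edges are assumed to be inclusive, as in search().

```python
import numpy as np

M0 = np.array([5.1, 5.4, 6.4, 6.8, 7.9])
M1 = np.array([5.2, 5.7, 8.8, 8.9, 9.1, 9.2])
M2 = np.array([6.1, 6.2, 6.5, 7.2])

# Hypothetical names for the edges of the first bin
lo, hi = 5.0, 5.5

# Count the elements of each array with lo <= x <= hi (both edges inclusive)
first_row = [int(np.count_nonzero((m >= lo) & (m <= hi))) for m in (M0, M1, M2)]
print(first_row)  # [2, 1, 0]
```

The mask scans every element and does not exploit the sorting, so it is useful for checking results rather than for speed.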

In my actual task, I will be dealing with a zbin much longer than the one in this example, and with many more (and larger) arrays like M0, M1, ... Mn. All the Ms and the array zbin will always be sorted. I am wondering whether the function I have designed (search()) and the method I am following are the fastest way to achieve the desired functionality. I would really appreciate any help.

Recommended answer

We could make use of the sorted nature and hence use np.searchsorted for this task, like so -

out = np.empty((len(zbin)-1, len(lst)),dtype=int)
for i,l in enumerate(lst):
    left_idx = np.searchsorted(l, zbin[:-1], 'left')
    right_idx = np.searchsorted(l, zbin[1:], 'right')
    out[:,i] = right_idx - left_idx
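
For reference, here is the same idea as a self-contained sketch that can be run against the example data from the question. The wrapper function count_in_bins and its parameter names are made up for illustration; only np.searchsorted and its side argument come from NumPy.

```python
import numpy as np

def count_in_bins(arrays, edges):
    """Count elements of each sorted array in each closed bin [edges[k], edges[k+1]].

    Hypothetical helper: returns an int array of shape (len(edges) - 1, len(arrays)).
    """
    edges = np.asarray(edges)
    out = np.empty((len(edges) - 1, len(arrays)), dtype=int)
    for i, arr in enumerate(arrays):
        # Index of the first element >= each lower edge
        left_idx = np.searchsorted(arr, edges[:-1], side="left")
        # Index just past the last element <= each upper edge
        right_idx = np.searchsorted(arr, edges[1:], side="right")
        out[:, i] = right_idx - left_idx
    return out

M0 = np.array([5.1, 5.4, 6.4, 6.8, 7.9])
M1 = np.array([5.2, 5.7, 8.8, 8.9, 9.1, 9.2])
M2 = np.array([6.1, 6.2, 6.5, 7.2])
zbin = np.array([5.0, 5.5, 6.0, 6.5])

counts = count_in_bins([M0, M1, M2], zbin)
print(counts)
```

On this data the result matches zarr from the question: rows [2, 1, 0], [0, 1, 0] and [1, 0, 3]. Each array costs one binary search per bin edge, rather than a scan of the array for every bin. Because both edges are inclusive, a value equal to an interior edge of zbin is counted in both bins that share it. Passing side="left" to both calls would give half-open bins [a, b) instead.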

07-31 23:16