Efficiently extracting NumPy subarrays from a mask
Problem description
I am looking for a Pythonic way to extract multiple subarrays from a given array using a mask, as shown in the example:
a = np.array([10, 5, 3, 2, 1])
m = np.array([True, True, False, True, True])
The output should be a collection of arrays like the following, where each contiguous region of True values in the mask m (True values next to each other) gives the indices of one subarray.
L[0] = np.array([10, 5])
L[1] = np.array([2, 1])
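Recommended answer
Here is one approach:
import numpy as np

def separate_regions(a, m):
    # Pad the mask with False so regions touching either end still produce two edges
    m0 = np.concatenate(([False], m, [False]))
    # Indices where the padded mask flips: even entries are starts, odd entries are stops
    idx = np.flatnonzero(m0[1:] != m0[:-1])
    return [a[idx[i]:idx[i+1]] for i in range(0, len(idx), 2)]
Sample run: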
In [41]: a = np.array([10, 5, 3, 2, 1])
    ...: m = np.array([True, True, False, True, True])
    ...:
In [42]: separate_regions(a, m)
Out[42]: [array([10, 5]), array([2, 1])]
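To see why the slicing in the sample run works, the intermediate values can be traced for the same input. This is only an illustrative sketch: the names starts, stops and regions are not from the original answer, and the pairing is written with zip instead of the index loop.

```python
import numpy as np

a = np.array([10, 5, 3, 2, 1])
m = np.array([True, True, False, True, True])

# Pad with False on both sides so every run of True has a rising and a falling edge.
m0 = np.concatenate(([False], m, [False]))

# Positions where consecutive padded values differ.
idx = np.flatnonzero(m0[1:] != m0[:-1])

# Edges alternate: start (inclusive), stop (exclusive), start, stop, ...
starts, stops = idx[::2], idx[1::2]
regions = [a[s:e] for s, e in zip(starts, stops)]

print(idx.tolist())                   # [0, 2, 3, 5]
print([r.tolist() for r in regions])  # [[10, 5], [2, 1]]
```

Because the padded mask starts and ends with False, the number of edges is always even, so every start has a matching stop.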
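Runtime test
Other approaches:
# @kazemakase's solution
def zip_split(a, m):
    # Cut positions: one past each index where the mask changes value
    d = np.diff(m)
    cuts = np.flatnonzero(d) + 1
    asplit = np.split(a, cuts)
    msplit = np.split(m, cuts)
    # Keep only the segments whose mask values are all True
    L = [aseg for aseg, mseg in zip(asplit, msplit) if np.all(mseg)]
    return L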
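Before comparing speed, it may be worth confirming that the two functions return the same regions. The following is a minimal sketch under a few assumptions: the helper same_regions, the seed and the array size are illustrative and not part of the original thread, and a NumPy version where np.diff accepts boolean arrays is assumed.

```python
import numpy as np

def separate_regions(a, m):
    m0 = np.concatenate(([False], m, [False]))
    idx = np.flatnonzero(m0[1:] != m0[:-1])
    return [a[idx[i]:idx[i+1]] for i in range(0, len(idx), 2)]

def zip_split(a, m):
    cuts = np.flatnonzero(np.diff(m)) + 1
    asplit = np.split(a, cuts)
    msplit = np.split(m, cuts)
    return [aseg for aseg, mseg in zip(asplit, msplit) if np.all(mseg)]

def same_regions(x, y):
    # Hypothetical helper: two lists of arrays match if their lengths
    # agree and each pair of arrays has identical contents.
    return len(x) == len(y) and all(np.array_equal(p, q) for p, q in zip(x, y))

rng = np.random.default_rng(0)   # seed chosen arbitrarily
a = rng.integers(0, 9, 1000)
m = rng.random(1000) > 0.2

agree = same_regions(separate_regions(a, m), zip_split(a, m))
print(agree)
```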
Timings:
In [49]: a = np.random.randint(0,9,(100000))
In [50]: m = np.random.rand(100000)>0.2
# @kazemakase's solution
In [51]: %timeit zip_split(a,m)
10 loops, best of 3: 114 ms per loop
# @Daniel Forsman's solution
In [52]: %timeit splitByBool(a,m)
10 loops, best of 3: 25.1 ms per loop
# Proposed in this post
In [53]: %timeit separate_regions(a, m)
100 loops, best of 3: 5.01 ms per loop
Increasing the average length of the islands:
In [58]: a = np.random.randint(0,9,(100000))
In [59]: m = np.random.rand(100000)>0.1
In [60]: %timeit zip_split(a,m)
10 loops, best of 3: 64.3 ms per loop
In [61]: %timeit splitByBool(a,m)
100 loops, best of 3: 14 ms per loop
In [62]: %timeit separate_regions(a, m)
100 loops, best of 3: 2.85 ms per loop
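Lowering the threshold from 0.2 to 0.1 raises the share of True values from about 80% to about 90%, so the islands become longer and fewer. The sketch below is one way to check that effect and to repeat the timing outside IPython with the standard timeit module. The seed, the repeat counts and the results list are assumptions made for illustration, and the timings will differ from the figures quoted above depending on hardware and NumPy version.

```python
import timeit
import numpy as np

def separate_regions(a, m):
    m0 = np.concatenate(([False], m, [False]))
    idx = np.flatnonzero(m0[1:] != m0[:-1])
    return [a[idx[i]:idx[i+1]] for i in range(0, len(idx), 2)]

rng = np.random.default_rng(0)   # seed chosen arbitrarily
n = 100000
a = rng.integers(0, 9, n)

results = []
for threshold in (0.2, 0.1):
    m = rng.random(n) > threshold
    regions = separate_regions(a, m)
    mean_len = float(np.mean([len(r) for r in regions]))
    # Best of 3 repeats, 10 calls each, converted to seconds per call.
    best = min(timeit.repeat(lambda: separate_regions(a, m), number=10, repeat=3)) / 10
    results.append((threshold, len(regions), mean_len, best))
    print(f"threshold={threshold}: {len(regions)} islands, "
          f"mean length {mean_len:.2f}, {best * 1e3:.2f} ms per call")
```

Fewer islands mean fewer slices for the list comprehension to build, which is consistent with the faster second run.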