Problem description
If I have a numpy.ndarray that's, say, 300 points in size (1 x 300 for now), and I want to select 10 points out of every 30 points, how would I do that?
In other words: I want the first 10 points, then skip 20, then grab 10 more, then skip 20 again, and so on until the end of the array.
Recommended answer
To select 10 elements off each block of 30 elements, we can simply reshape into 2D and slice out the first 10 columns from each row -
a.reshape(-1,30)[:,:10]
The benefit is that the output is a view into the input, so it is virtually free and carries no extra memory overhead. Let's have a sample run to show and prove that -
In [43]: np.random.seed(0)
In [44]: a = np.random.randint(0,9,(1,300))
In [48]: np.shares_memory(a,a.reshape(10,30)[:,:10])
Out[48]: True
If you need a flattened version, use .ravel() -
a.reshape(-1,30)[:,:10].ravel()
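As a quick check of which positions the reshape-and-slice picks, here is a minimal sketch. It assumes an array filled with np.arange so that each value equals its own index; that choice is for illustration only and is not part of the original answer.

```python
import numpy as np

# Values equal their positions, so the output shows which indices were kept
a = np.arange(300).reshape(1, 300)   # same 1 x 300 shape as in the question

blocks = a.reshape(-1, 30)[:, :10]   # one row per 30-element block, first 10 columns
flat = blocks.ravel()                # flattened 1D result with 100 elements

# Indices we expect to keep: 0..9, 30..39, 60..69, ...
expected = np.concatenate([np.arange(s, s + 10) for s in range(0, 300, 30)])

print(blocks.shape)                     # (10, 10)
print(np.array_equal(flat, expected))   # True
```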
Timings -
In [38]: a = np.random.randint(0,9,(300))
# @sacul's soln
In [39]: %%timeit
...: msk = [True] * 10 + [False] * 20
...: out = a[np.tile(msk, len(a)//len(msk))]
100000 loops, best of 3: 7.6 µs per loop
# From this post
In [40]: %timeit a.reshape(-1,30)[:,:10].ravel()
1000000 loops, best of 3: 1.07 µs per loop
In [41]: a = np.random.randint(0,9,(3000000))
# @sacul's soln
In [42]: %%timeit
...: msk = [True] * 10 + [False] * 20
...: out = a[np.tile(msk, len(a)//len(msk))]
100 loops, best of 3: 3.66 ms per loop
# From this post
In [43]: %timeit a.reshape(-1,30)[:,:10].ravel()
100 loops, best of 3: 2.32 ms per loop
# If you are okay with `2D` output, it is virtually free
In [44]: %timeit a.reshape(-1,30)[:,:10]
1000000 loops, best of 3: 519 ns per loop
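The numbers above depend on the machine and the NumPy version. To reproduce the comparison outside IPython, something like the following sketch could be used. The function names with_mask and with_reshape are made up here for illustration, and the standard-library timeit module stands in for the %timeit magic.

```python
import timeit
import numpy as np

a = np.random.randint(0, 9, 3_000_000)

def with_mask(a):
    # Boolean-mask approach from the timings above
    msk = [True] * 10 + [False] * 20
    return a[np.tile(msk, len(a) // len(msk))]

def with_reshape(a):
    # Reshape-and-slice approach from this answer
    return a.reshape(-1, 30)[:, :10].ravel()

# Both approaches should select exactly the same elements
assert np.array_equal(with_mask(a), with_reshape(a))

for fn in (with_mask, with_reshape):
    # Best of 3 repeats, 10 calls each, reported per call
    best = min(timeit.repeat(lambda: fn(a), number=10, repeat=3)) / 10
    print(f"{fn.__name__}: {best * 1e3:.3f} ms per call")
```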
Generic case with 1D array
A. No. of elements being a multiple of the block length
For a 1D array a whose number of elements is a multiple of n, to select m elements off each block of n elements and get a 1D array output, we would have:
a.reshape(-1,n)[:,:m].ravel()
Note that the ravel() flattening step makes a copy here, because the sliced 2D view is not contiguous. So, if possible, keep the unflattened 2D version for memory efficiency.
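The view-versus-copy distinction can be checked directly. The sketch below uses a small array and the illustrative values n = 5, m = 2, which are assumptions for the demo and not taken from the timings above.

```python
import numpy as np

n, m = 5, 2
a = np.arange(20)

view2d = a.reshape(-1, n)[:, :m]   # 2D result: still a view into `a`
flat = view2d.ravel()              # 1D result: a fresh copy

print(np.shares_memory(a, view2d))  # True
print(np.shares_memory(a, flat))    # False

# Writing through the 2D view changes `a`; writing to the copy does not
view2d[0, 0] = -1
flat[1] = -2
print(a[:2])                        # [-1  1]
```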
Sample run -
In [59]: m,n = 2,5
In [60]: N = 25
In [61]: a = np.random.randint(0,9,(N))
In [62]: a
Out[62]:
array([5, 0, 3, 3, 7, 3, 5, 2, 4, 7, 6, 8, 8, 1, 6, 7, 7, 8, 1, 5, 8, 4,
       3, 0, 3])
# Select 2 elements off each block of 5 elements
In [63]: a.reshape(-1,n)[:,:m].ravel()
Out[63]: array([5, 0, 3, 5, 6, 8, 7, 7, 8, 4])
B. Generic no. of elements
We would leverage np.lib.stride_tricks.as_strided, inspired by this post, to select m elements off each block of n elements -
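def skipped_view(a, m, n):
    s = a.strides[0]
    strided = np.lib.stride_tricks.as_strided
    # Round the row count up so a trailing partial block gets its own row
    shp = ((a.size+n-1)//n,n)
    return strided(a,shape=shp,strides=(n*s,s), writeable=False)[:,:m]

def slice_m_everyn(a, m, n):
    a_slice2D = skipped_view(a,m,n)
    # Number of elements contributed by the trailing partial block (at most m)
    extra = min(m,len(a)-n*(len(a)//n))
    L = m*(len(a)//n) + extra
    # ravel() copies; [:L] drops anything read past the end of `a`
    return a_slice2D.ravel()[:L]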
Note that skipped_view gets us a view into the input array that may also extend into a memory region not assigned to the input array. After that, we flatten and slice to restrict it to the desired output, and that step makes a copy.
Sample run -
In [170]: np.random.seed(0)
...: a = np.random.randint(0,9,(16))
In [171]: a
Out[171]: array([5, 0, 3, 3, 7, 3, 5, 2, 4, 7, 6, 8, 8, 1, 6, 7])
# Select 2 elements off each block of 5 elements
In [172]: slice_m_everyn(a, m=2, n=5)
Out[172]: array([5, 0, 3, 5, 6, 8, 7])
In [173]: np.random.seed(0)
...: a = np.random.randint(0,9,(19))
In [174]: a
Out[174]: array([5, 0, 3, 3, 7, 3, 5, 2, 4, 7, 6, 8, 8, 1, 6, 7, 7, 8, 1])
# Select 2 elements off each block of 5 elements
In [175]: slice_m_everyn(a, m=2, n=5)
Out[175]: array([5, 0, 3, 5, 6, 8, 7, 7])
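If reading past the end of the buffer with as_strided is a concern, one alternative is a boolean mask built from index arithmetic. This is a different technique from the strided view above, not the answer's method, and the helper name slice_m_everyn_mask is made up for this sketch. It always copies, but it never touches memory outside the array and works for any length.

```python
import numpy as np

def slice_m_everyn_mask(a, m, n):
    """Hypothetical helper: keep the first m elements of every n-element block."""
    a = np.asarray(a)
    # Position of each element inside its block is index % n; keep positions 0..m-1
    keep = (np.arange(a.size) % n) < m
    return a[keep]

a = np.array([5, 0, 3, 3, 7, 3, 5, 2, 4, 7, 6, 8, 8, 1, 6, 7])
print(slice_m_everyn_mask(a, m=2, n=5))   # [5 0 3 5 6 8 7]
```

The output matches the 16-element sample run above, including the single element taken from the trailing partial block.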