python - Flipping blocks of consecutive 1s up to a given size in a binary numpy matrix

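I am working on an image analysis project. I have converted the pictures of interest (NxM numpy arrays) to binary format. The "1"s in the matrix mark regions of interest. Besides the genuine regions of interest there is noise that cannot represent image features. For example, in a horizontal slice of the image I do not care about isolated 1s, groups of 2, or anything up to 5 consecutive 1s. I would like a fast way to flip them (i.e. set them to 0).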

My MWE for flipping isolated 1s:

import numpy as np
img = np.random.choice([0,1],size=(1000,1000), p=[1./2,1./2])

# now we take the second derivative of the matrix along the horizontal axis
# since we have a binary matrix, an isolated 1, that is [...010...], is captured
# by a second derivative entry equal to -2
# because [...010...] -> dx -> [...1,-1...] -> dx -> [...-2...]

ddx_img = np.diff(np.diff(img,1),1)
to_flip = np.where(ddx_img==-2) # returns a tuple of (row, column) index arrays

# the second derivative eats up one index position horizontally, so I need to add
# +1 to the column indices of the tuple

temp_copy = to_flip[1].copy() # tuples are immutable, so work on a copy of the column indices
temp_copy+=1
to_flip = (to_flip[0],temp_copy)

# now we can flip the entries by adding +1 to the entries to flip and taking mod 2
img[to_flip]=np.mod(img[to_flip]+1,2)
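
To make the second-difference argument concrete, here is a minimal sketch on a single hand-written row (the array `row` is made up for illustration). It assumes a signed integer dtype, as produced by `np.random.choice([0,1], ...)`; with `uint8` data the differences would wrap around, and with boolean data `np.diff` computes an XOR instead, so a cast such as `img.astype(np.int8)` would be needed first.

```python
import numpy as np

# One row containing an isolated 1 (column 1) and a pair of 1s (columns 4-5).
row = np.array([[0, 1, 0, 0, 1, 1, 0]])

d1 = np.diff(row, axis=1)   # first difference along the horizontal axis
d2 = np.diff(d1, axis=1)    # second difference: two columns shorter than row

print(d1)  # [[ 1 -1  0  1  0 -1]]
print(d2)  # [[-2  1  1 -1 -1]]

# d2[:, j] is centred on column j + 1 of the original row, hence the +1 shift.
cols = np.where(d2 == -2)[1] + 1
print(cols)  # [1]
```

Only the isolated 1 produces a -2; the pair at columns 4 and 5 does not, which is why this test cannot be reused unchanged for larger islands.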

This takes about 9 ms on my machine. I can afford a routine that takes up to 1 second.

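I welcome any criticism of the code (I am not a good Python programmer), as well as any ideas on how to extend this procedure efficiently from isolated 1s to islands of consecutive 1s of a general size S.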

Thanks in advance.

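Edit: I realised that the mod is unnecessary. When I wrote this I also wanted to flip islands of 0s that were too small. One could replace = mod(...) with = 0.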
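
The remark about small islands of 0s can be handled with the same second difference: an isolated 0, i.e. [...101...], shows up as +2 instead of -2. The following is a minimal sketch under that observation, using a small made-up array; it again assumes a signed integer dtype and leaves the first and last columns untouched.

```python
import numpy as np

img = np.array([[1, 0, 1, 1, 0, 1, 0, 0],
                [0, 1, 0, 0, 1, 1, 0, 1]])

# Second difference along the horizontal axis (equivalent to two np.diff calls).
ddx = np.diff(img, n=2, axis=1)

out = img.copy()
inner = out[:, 1:-1]      # view on the interior columns, aligned with ddx
inner[ddx == -2] = 0      # isolated 1s  -> 0
inner[ddx == 2] = 1       # isolated 0s  -> 1

print(out)
```

Both masks are taken from the same `ddx`, so the two kinds of flips are decided on the original image rather than one after the other.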

Best answer

Specific case

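After the edit, it seems you can use some slicing and so avoid making intermediate copies, which helps performance. Here are two lines of code that produce the desired output: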

# Calculate second derivative
ddx_img = np.diff(np.diff(img,1),1)

# Take a sliced view of img that excludes the first and last columns,
# and set to zero the positions where the second derivative equals -2
img[:,1:-1][ddx_img==-2] = 0
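
The reason the slicing line modifies `img` itself is that a basic slice such as `img[:,1:-1]` is a view, and a masked assignment on a view writes through to the underlying array. A minimal sketch on a made-up row illustrates this:

```python
import numpy as np

img = np.array([[0, 1, 0, 1, 1, 0]])

inner = img[:, 1:-1]                 # basic slicing returns a view, not a copy
print(np.shares_memory(img, inner))  # True

mask = np.diff(img, n=2, axis=1) == -2   # same shape as inner
inner[mask] = 0                          # assignment writes into img

print(img)  # [[0 0 0 1 1 0]]
```

No index arrays have to be built or shifted, since the view already carries the one-column offset.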

Runtime test and verification of the results:

In [42]: A = np.random.choice([0,1],size=(1000,1000), p=[1./2,1./2])

In [43]: def slicing_based(A):
    ...:    img = A.copy()
    ...:    ddx_img = np.diff(np.diff(img,1),1)
    ...:    img[:,1:-1][ddx_img==-2] = 0
    ...:    return img
    ...:
    ...:
    ...: def original_approach(A):
    ...:
    ...:    img = A.copy()
    ...:
    ...:    ddx_img = np.diff(np.diff(img,1),1)
    ...:    to_flip = np.where(ddx_img==-2)
    ...:
    ...:    temp_copy = to_flip[1].copy()
    ...:    temp_copy+=1
    ...:    to_flip = (to_flip[0],temp_copy)
    ...:
    ...:    img[to_flip] = 0
    ...:
    ...:    return img
    ...:

In [44]: %timeit slicing_based(A)
100 loops, best of 3: 15.3 ms per loop

In [45]: %timeit original_approach(A)
10 loops, best of 3: 20.1 ms per loop

In [46]: np.allclose(slicing_based(A),original_approach(A))
Out[46]: True

General case

To make the solution generic, one can use some signal processing, specifically 2D convolution, as shown below:

import numpy as np
from scipy.signal import convolve2d

# Define kernel
K1 = np.array([[0,1,1,0]]) # Edit this for different island lengths
K2 = 1-K1

# Generate masks of the same shape as img: convolve img with the kernel and
# the inverted image with the inverted kernel, then compare the convolved sums
# against the kernel sums. A match means that position fulfils both the ONES
# and the ZEROS criteria
mask1 = convolve2d(img, K1, boundary='fill',fillvalue=0, mode='same')==K1.sum()
mask2 = convolve2d(img==0, K2, boundary='fill',fillvalue=0, mode='same')==K2.sum()

# Spread the combined mask across the kernel length
# and use it to set those positions in img to zeros
K3 = np.ones((1,K1.size))
mask3 = convolve2d(mask1 & mask2, K3, boundary='fill',fillvalue=0, mode='same')>0
img_out = img*(~mask3)
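
The comment "Edit this for different island lengths" can be turned into a small helper. The function `make_kernels` below is a hypothetical name introduced only for this sketch; it builds the pair of kernels for an island of exactly S ones bounded by a zero on each side.

```python
import numpy as np

def make_kernels(S):
    """Return (K1, K2) for a horizontal island of exactly S ones.

    K1 marks the S interior positions that must be 1;
    K2 marks the two flanking positions that must be 0.
    """
    K1 = np.zeros((1, S + 2), dtype=int)
    K1[0, 1:-1] = 1
    K2 = 1 - K1
    return K1, K2

K1, K2 = make_kernels(2)
print(K1)  # [[0 1 1 0]]
print(K2)  # [[1 0 0 1]]
```

Each kernel pair matches one island length only, so removing every island up to size S with this approach would mean repeating the masking step for lengths 1 through S.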

Sample input and output:

In [250]: img
Out[250]:
array([[0, 1, 1, 1, 0, 1, 1, 1],
       [1, 1, 1, 1, 1, 1, 0, 1],
       [1, 0, 1, 1, 1, 1, 0, 0],
       [1, 1, 1, 1, 0, 1, 0, 1],
       [1, 1, 0, 1, 1, 0, 1, 1],
       [1, 0, 1, 1, 1, 1, 1, 1],
       [1, 1, 0, 1, 1, 0, 1, 0],
       [1, 1, 1, 0, 1, 1, 1, 1]])

In [251]: img_out
Out[251]:
array([[0, 1, 1, 1, 0, 1, 1, 1],
       [1, 1, 1, 1, 1, 1, 0, 1],
       [1, 0, 1, 1, 1, 1, 0, 0],
       [1, 1, 1, 1, 0, 1, 0, 1],
       [1, 1, 0, 0, 0, 0, 0, 1],
       [1, 0, 1, 1, 1, 1, 1, 1],
       [1, 1, 0, 0, 0, 0, 0, 0],
       [1, 1, 1, 0, 1, 1, 1, 1]])
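
In the sample output above, the 1 at column index 6 of rows 4 and 6 is cleared along with the length-2 island. This appears to come from `mode='same'` placing an even-length kernel one column off-centre, so it is worth checking the window alignment before relying on the result. As an alternative that handles every island up to a size S in one pass, here is a minimal run-length sketch using only numpy. It is not the answer's method; the function name `remove_short_runs` is made up for illustration, and it assumes that runs touching the left or right border count as closed by the border (the convolution version leaves those alone).

```python
import numpy as np

def remove_short_runs(img, max_len):
    """Zero out horizontal runs of 1s whose length is <= max_len."""
    img = np.asarray(img)
    n_rows, n_cols = img.shape
    ones = (img == 1).astype(np.int8)

    # Pad each row with a 0 on both sides so border runs get a start and an end.
    padded = np.pad(ones, ((0, 0), (1, 1)), mode="constant")
    d = np.diff(padded, axis=1)          # shape (n_rows, n_cols + 1)

    # np.nonzero scans row by row, so the k-th start pairs with the k-th end.
    rows, starts = np.nonzero(d == 1)    # first column of each run
    _, ends = np.nonzero(d == -1)        # one past the last column of each run
    short = (ends - starts) <= max_len

    # Mark +1 at the start and -1 at the end of each short run, then
    # accumulate along the row to fill the whole run.
    delta = np.zeros((n_rows, n_cols + 1), dtype=np.int8)
    delta[rows[short], starts[short]] = 1
    delta[rows[short], ends[short]] = -1
    mask = np.cumsum(delta, axis=1)[:, :n_cols] > 0

    out = img.copy()
    out[mask] = 0
    return out

demo = np.array([[0, 1, 1, 0, 1, 1, 1, 0, 1],
                 [1, 0, 0, 1, 1, 1, 1, 0, 0]])
print(remove_short_runs(demo, 2))
```

Short islands of 0s can be filled with the same function by inverting before and after, e.g. `1 - remove_short_runs(1 - img, S)`.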