
Problem description

I have a numpy array:

import numpy as np

arr = np.arange(20).reshape(2,10)
arr[1,:] = 0
arr[1,2] = arr[1,5] = arr[1,7] = 1
print(arr)
# [[0 1 2 3 4 5 6 7 8 9]
#  [0 0 1 0 0 1 0 1 0 0]]

I want to extract overlapping arrays, each starting at a 1 and ending one column after the next 1. Expected output:

[[0 1 2 3]
 [0 0 1 0]]

[[2 3 4 5 6]
 [1 0 0 1 0]]

[[5 6 7 8]
 [1 0 1 0]]

[[7 8 9]
 [1 0 0]]

At the moment, I have an index-based for loop that feels awkward in a numpy context and also has to treat the first and last segments as special cases:

arr[1,0] = 1
ind = list(np.where(arr[1,:]))[0]
print(ind)

for i, j in enumerate(ind):
    if not i:
        continue
    curr = np.copy(arr[:, ind[i-1]:j+2])
    print(curr)

#last segment
curr = np.copy(arr[:, j:])
print(curr)

This approach gives me the desired output, but I cannot believe there is no more numpy-idiomatic way to achieve this (although the tumbleweed reaction here may indicate just that). If there is an easier pandas solution, that would also be fine. Ideally, the output is a list of these arrays or a similar data structure; the arrays don't have to be returned individually.

Recommended answer

Here is part of a solution. It is my favorite approach and not complicated:

>>> split_idx = np.flatnonzero(arr[1]) + 2
>>> np.split(arr, split_idx, axis=1)
[array([[0, 1, 2, 3],
        [0, 0, 1, 0]]),
 array([[4, 5, 6],
        [0, 1, 0]]),
 array([[7, 8],
        [1, 0]]),
 array([[9],
        [0]])]
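
To see why this is only part of a solution, it may help to inspect the split indices and the resulting pieces. The sketch below assumes arr is the original array from the question, before the loop there sets arr[1, 0] = 1; the names ones and pieces are only illustrative.

```python
import numpy as np

# Rebuild the array from the question (arr[1, 0] is still 0 here).
arr = np.arange(20).reshape(2, 10)
arr[1, :] = 0
arr[1, 2] = arr[1, 5] = arr[1, 7] = 1

ones = np.flatnonzero(arr[1])      # columns holding a 1: [2 5 7]
split_idx = ones + 2               # cut one column after each 1: [4 7 9]
pieces = np.split(arr, split_idx, axis=1)

# The pieces tile the array without overlap, so each interior piece
# is still missing the columns it should share with its predecessor.
print([p.shape[1] for p in pieces])                          # [4, 3, 2, 1]
print(np.array_equal(np.concatenate(pieces, axis=1), arr))   # True
```

Only the first piece already matches the expected output; the others start too late.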

But two things indicate that any numpy-style approach is a poor fit for this problem:

  • You're forced to work with a list of arrays of different shapes, which numpy is not designed for. So np.split is quite slow.
  • You can't build the result in a single pass over the array. Extra columns have to be inserted at the beginning of each interior item to get the overlap.
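
One possible way around the extra insertions is to drop np.split and slice the array directly with paired start and stop columns. This is a minimal sketch rather than part of the original answer: the names ones, starts, stops and segments are illustrative, and it assumes the first column of row 1 is not already a 1 (otherwise the leading 0 in starts would produce a duplicate start).

```python
import numpy as np

# Rebuild the array from the question (arr[1, 0] is still 0 here).
arr = np.arange(20).reshape(2, 10)
arr[1, :] = 0
arr[1, 2] = arr[1, 5] = arr[1, 7] = 1

ones = np.flatnonzero(arr[1])                        # [2 5 7]
# The first segment starts at column 0, every later one at a 1.
starts = np.concatenate(([0], ones))                 # [0 2 5 7]
# Each segment ends one column after the next 1; the last runs to the end.
stops = np.concatenate((ones + 2, [arr.shape[1]]))   # [4 7 9 10]

# Basic slicing returns views, so no data is copied here.
segments = [arr[:, a:b] for a, b in zip(starts, stops)]

for seg in segments:
    print(seg)
```

For the example array this yields the four overlapping segments listed in the question. The result is still a Python list of differently shaped arrays, so the first point above continues to apply; call np.copy on a segment if it must be independent of arr.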

