Finding the cumsum of sub-arrays formed by splitting a NumPy array at given indices

Problem description

Given an array 'array' and a set of indices 'indices', how do I find, in a vectorized manner, the cumulative sum of the sub-arrays formed by splitting the array at those indices? To clarify, suppose I have:

>>> array = np.arange(20)
>>> array
array([ 0,  1,  2,  3,  4,  5,  6,  7,  8,  9, 10, 11, 12, 13, 14, 15, 16, 17, 18, 19])
>>> indices = np.array([3, 8, 14])

The operation should output:

array([0, 1, 3, 3, 7, 12, 18, 25, 8, 17, 27, 38, 50, 63, 14, 29, 45, 62, 80, 99])

Please note that the array is very large (100,000 elements), so I need a vectorized answer; any loop would slow it down considerably. Also, if I had the same problem with a 2D array and corresponding indices, and needed to do the same thing for each row of the array, how would I do it?

For the 2D version:

>>> array = np.arange(12).reshape((3,4))
>>> array
array([[ 0,  1,  2,  3],
       [ 4,  5,  6,  7],
       [ 8,  9, 10, 11]])
>>> indices = np.array([[2], [1, 3], [1, 2]], dtype=object)

would output:

array([[ 0,  1,  2,  5],
       [ 4,  5, 11,  7],
       [ 8,  9, 10, 21]])

To clarify: every row will be split.

Recommended answer

You can introduce differences of the cumulatively summed array at the positions given by indices to create a boundary-like effect there, so that when the modified array is cumulatively summed, the running total restarts at each index. Concretely, the value at each index is reduced by the sum of the group that precedes it. This might feel a bit contrived at first, but stick with it and try it on other samples, and it should make sense. The idea is very similar to the one applied in this other MATLAB solution. Following that philosophy, here's one approach using numpy.diff along with cumulative summation:

import numpy as np

# Get linear (flattened) indices of every split position
n = array.shape[1]
lidx = np.hstack([row*n + np.array(item) for row, item in enumerate(indices)])

# Get successive differentiations
diffs = array.cumsum(1).ravel()[lidx] - array.ravel()[lidx]

# Get previous group's offsetted summations for each row at all
# indices positions across the entire 2D array. Integer division
# recovers the row of each index; idx marks the first index in each row.
_, idx = np.unique(lidx // n, return_index=True)
offsetted_diffs = np.diff(np.append(0,diffs))
offsetted_diffs[idx] = diffs[idx]

# Get a copy of input array and place previous group's offsetted summations
# at indices. Then, do cumulative sum which will create a boundary like
# effect with those offsets at indices positions.
arrayc = array.copy()
arrayc.ravel()[lidx] -= offsetted_diffs
out = arrayc.cumsum(1)

This should be an almost vectorized solution; "almost" because the linear indices are computed in a loop, but since that is not the computationally intensive part here, its effect on the total runtime should be minimal. Also, you can replace arrayc with array if you don't mind overwriting the input to save memory.
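For the 1D case from the question, the same offset trick can be sketched as below. This is an illustration, not part of the original answer: the function name segmented_cumsum_1d is hypothetical, and the sketch assumes indices is sorted, contains no duplicates, and lies within the bounds of the array.

```python
import numpy as np

def segmented_cumsum_1d(array, indices):
    """Cumulative sum that restarts at each position in `indices`.

    Hypothetical helper; assumes `indices` is sorted, unique, and in range.
    """
    array = np.asarray(array)
    indices = np.asarray(indices, dtype=np.intp)
    # Sum of all elements strictly before each split position.
    before = array.cumsum()[indices] - array[indices]
    # Sum of only the segment that ends just before each split position.
    prev_segment = np.diff(before, prepend=0)
    out = array.copy()
    # Subtracting the previous segment's sum makes the running total restart.
    out[indices] -= prev_segment
    return out.cumsum()

result = segmented_cumsum_1d(np.arange(20), [3, 8, 14])
print(result)
```

No Python-level loop is involved here, because the 1D case needs no per-row conversion to linear indices.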

Sample input and output:

In [75]: array
Out[75]:
array([[ 0,  1,  2,  3,  4,  5,  6,  7],
       [ 8,  9, 10, 11, 12, 13, 14, 15],
       [16, 17, 18, 19, 20, 21, 22, 23]])

In [76]: indices
Out[76]: array([[3, 6], [4, 7], [5]], dtype=object)

In [77]: out
Out[77]:
array([[ 0,  1,  3,  3,  7, 12,  6, 13],
       [ 8, 17, 27, 38, 12, 25, 39, 15],
       [16, 33, 51, 70, 90, 21, 43, 66]])
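One way to sanity-check a result like this is a slow, loop-based reference that splits each row with np.split and takes the cumulative sum of every piece. The sketch below is only a checking aid, not the vectorized method; the name reference_rowwise is hypothetical, and the indices are kept as a plain nested list because the rows have different lengths.

```python
import numpy as np

def reference_rowwise(array, indices):
    """Loop-based check: split each row at its indices, cumsum each piece."""
    rows = []
    for row, idx in zip(array, indices):
        pieces = np.split(row, idx)
        rows.append(np.concatenate([piece.cumsum() for piece in pieces]))
    return np.vstack(rows)

array = np.arange(24).reshape(3, 8)
indices = [[3, 6], [4, 7], [5]]  # ragged, so kept as a nested list
expected = reference_rowwise(array, indices)
print(expected)
```

Comparing this reference against the vectorized output with np.array_equal on a few random inputs is a cheap way to catch indexing mistakes.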


07-24 09:57