This post covers how to split a NumPy array into smaller chunks (batches) and then iterate over them.

Problem description

Suppose I have this NumPy array:

[[1, 2, 3],
[4, 5, 6],
[7, 8, 9],
[10, 11, 12]]

I want to split it into 2 batches and then iterate over them:

[[1, 2, 3],      Batch 1
[4, 5, 6]]

[[7, 8, 9],      Batch 2
[10, 11, 12]]

What is the simplest way to do this?

Sorry, I left out an important detail: my concern is that, as I carry on iterating, the original array would be destroyed by the splitting and the iteration over batches. Once one pass over the batches has finished, I need to start again from the first batch, so the original array must be preserved. The idea is to stay consistent with stochastic gradient descent algorithms, which iterate over batches. In a typical case, I might have a for loop of 100000 iterations over just 1000 batches, which have to be replayed again and again.

Recommended answer

Consider the array a:

import numpy as np

a = np.array([[1, 2, 3],
              [4, 5, 6],
              [7, 8, 9],
              [10, 11, 12]])

Option 1
Use reshape and //. Here a.shape[0] // 2 evaluates to 2, so the 4 rows are split into two batches.

a.reshape(a.shape[0] // 2, -1, a.shape[1])

array([[[ 1,  2,  3],
        [ 4,  5,  6]],

       [[ 7,  8,  9],
        [10, 11, 12]]])
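
To connect this to the question, here is a minimal sketch of iterating over the reshaped result. The names batches, batch_sums and shares_memory are only illustrative. It assumes a is a contiguous array like the one above, in which case reshape returns a view rather than a copy.

```python
import numpy as np

a = np.array([[1, 2, 3],
              [4, 5, 6],
              [7, 8, 9],
              [10, 11, 12]])

# Same reshape as above: the first axis now indexes the batches.
batches = a.reshape(a.shape[0] // 2, -1, a.shape[1])

# Iterating over a 3-D array yields one 2-D batch per step.
batch_sums = [int(batch.sum()) for batch in batches]

# For a contiguous array, reshape gives a view on the same data,
# and `a` itself keeps its original shape.
shares_memory = np.shares_memory(a, batches)

print(batch_sums)     # [21, 57]
print(shares_memory)  # True
print(a.shape)        # (4, 3)
```

Because a is never modified here, the same loop can be run again as often as needed.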

Option 2
If you want groups of two rows rather than two groups, fix the group size at 2 and let reshape infer the number of groups with -1.

a.reshape(-1, 2, a.shape[1])

array([[[ 1,  2,  3],
        [ 4,  5,  6]],

       [[ 7,  8,  9],
        [10, 11, 12]]])
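
Both reshape options need the number of rows to divide evenly. The sketch below is not part of the original answer. It uses a hypothetical 5-row array b to show the reshape failing, then np.array_split as one possible alternative that accepts uneven splits.

```python
import numpy as np

# Hypothetical array with 5 rows, which cannot be cut into groups of two.
b = np.arange(1, 16).reshape(5, 3)

try:
    b.reshape(-1, 2, b.shape[1])
    reshape_failed = False
except ValueError:
    # 15 elements cannot fill whole (2, 3) blocks.
    reshape_failed = True

# np.array_split returns a list of sub-arrays and tolerates uneven sizes.
parts = np.array_split(b, 2)
part_shapes = [p.shape for p in parts]

print(reshape_failed)  # True
print(part_shapes)     # [(3, 3), (2, 3)]
```

Note that np.array_split takes the number of sections, not the batch size, so it corresponds to the "two groups" reading.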

Option 3
Use a generator

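def get_every_n(a, n=2):
    # Yield consecutive slices of n rows; leftover rows that do not
    # fill a complete group are skipped.
    for i in range(a.shape[0] // n):
        yield a[n*i:n*(i+1)]

for sa in get_every_n(a, n=2):
    print(sa)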

[[1 2 3]
 [4 5 6]]
[[ 7  8  9]
 [10 11 12]]
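
The question also asks about replaying the batches many times without losing the original array. The following is a rough sketch of that, not the answerer's code: iter_batches is a hypothetical variant of the generator that also yields a shorter final batch, and the epoch loop stands in for an SGD-style training loop.

```python
import numpy as np

def iter_batches(a, batch_size):
    """Yield consecutive row slices of `a`, including a shorter last batch."""
    for start in range(0, a.shape[0], batch_size):
        yield a[start:start + batch_size]

a = np.arange(1, 13).reshape(4, 3)
original = a.copy()  # kept only to check that `a` is left untouched

n_epochs = 3
batches_seen = 0
for epoch in range(n_epochs):
    # A fresh generator per epoch restarts from the first batch.
    for batch in iter_batches(a, batch_size=2):
        batches_seen += 1  # placeholder for a gradient step

unchanged = np.array_equal(a, original)

# With 5 rows and batch_size=2, the last batch has a single row.
sizes = [len(b) for b in iter_batches(np.arange(15).reshape(5, 3), 2)]

print(batches_seen)  # 6
print(unchanged)     # True
print(sizes)         # [2, 2, 1]
```

Slicing only reads from a, so the array survives any number of passes. Creating the generator again inside the outer loop is what restarts the iteration from the first batch.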


08-20 13:04