Someone may have asked this before, but I couldn't find it.
What is the simplest way to repeatedly fetch batches of data from a dataset? Is there a built-in TensorFlow function to do this?
For example:

for i in range(num_trains):
    x_batch, y_batch = get_batch(x_train, y_train, batch_size)
    sess.run(train_step, feed_dict={x: x_batch, y: y_batch})

If there is no such built-in function, how would you implement it? I tried it myself, but I couldn't figure out how to get a new batch, different from the previous one, each time the function is called.

Thanks!

Best Answer

You can try:

# Feed batch data
def get_batch(inputX, inputY, batch_size):
    duration = len(inputX)
    for i in range(0, duration // batch_size):
        idx = i * batch_size
        yield inputX[idx:idx + batch_size], inputY[idx:idx + batch_size]


You can also use TensorFlow's Dataset API:

dataset = tf.data.Dataset.from_tensor_slices((train_x, train_y))
dataset = dataset.batch(batch_size)


Getting batches:

import numpy as np

X = np.arange(100)
Y = X
batch = get_batch(X, Y, 5)
batch_x, batch_y = next(batch)
print(batch_x, batch_y)
# [0 1 2 3 4] [0 1 2 3 4]

batch_x, batch_y = next(batch)
print(batch_x, batch_y)
# [5 6 7 8 9] [5 6 7 8 9]


Typically, to run over the dataset for multiple epochs, you would do something like the following. Note that get_batch is a generator that already yields every batch in the dataset, so no extra step loop is needed:

 for epoch in range(num_epochs):
     for x_batch, y_batch in get_batch(x_train, y_train, batch_size):
         sess.run(train_step, feed_dict={x: x_batch, y: y_batch})
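One caveat with get_batch as written: it yields the batches in the same order every epoch. A common refinement, not part of the original answer, is to shuffle the indices once per epoch so each epoch sees the samples in a different order. A minimal NumPy-only sketch (the name get_shuffled_batch is my own):

```python
import numpy as np

# Hypothetical variant of get_batch: draws a fresh random permutation
# of the indices on each call, so each epoch yields the samples in a
# different order while still covering the whole dataset.
def get_shuffled_batch(inputX, inputY, batch_size):
    perm = np.random.permutation(len(inputX))  # new order every epoch
    for i in range(len(inputX) // batch_size):
        idx = perm[i * batch_size:(i + 1) * batch_size]
        yield inputX[idx], inputY[idx]
```

In the epoch loop above you would simply call get_shuffled_batch(x_train, y_train, batch_size) in place of get_batch; because indexing with an index array copies, the original arrays are left untouched.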


Using the Dataset API:

  dataset = tf.data.Dataset.from_tensor_slices((X, Y))
  dataset = dataset.batch(5)
  iterator = dataset.make_initializable_iterator()
  train_x, train_y = iterator.get_next()
  with tf.Session() as sess:
      sess.run(iterator.initializer)
      for i in range(2):
          print(sess.run([train_x, train_y]))
  # [array([0, 1, 2, 3, 4]), array([0, 1, 2, 3, 4])]
  # [array([5, 6, 7, 8, 9]), array([5, 6, 7, 8, 9])]

Regarding python-3.x - getting batches in TensorFlow, we found a similar question on Stack Overflow: https://stackoverflow.com/questions/50539342/
