python - Periodically evaluating a large test set while training a convolutional neural network

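I have built a small convolutional neural network with TensorFlow and want to train it.
During training I want to log several metrics. One of them is the accuracy on a test set that is independent of the training set.
The MNIST example shows me how to do this: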

  # Train the model, and also write summaries.
  # Every 10th step, measure test-set accuracy, and write test summaries
  # All other steps, run train_step on training data, & add training summaries

  def feed_dict(train):
    """Make a TensorFlow feed_dict: maps data onto Tensor placeholders."""
    if train or FLAGS.fake_data:
      xs, ys = mnist.train.next_batch(100, fake_data=FLAGS.fake_data)
      k = FLAGS.dropout
    else:
      xs, ys = mnist.test.images, mnist.test.labels
      k = 1.0
    return {x: xs, y_: ys, keep_prob: k}

  for i in range(FLAGS.max_steps):
    if i % 10 == 0:  # Record summaries and test-set accuracy
      summary, acc = sess.run([merged, accuracy], feed_dict=feed_dict(False))
      test_writer.add_summary(summary, i)
      print('Accuracy at step %s: %s' % (i, acc))
    else:  # Record train set summaries, and train
      summary, _ = sess.run([merged, train_step], feed_dict=feed_dict(True))
      train_writer.add_summary(summary, i)

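What it does is feed the entire test set into the evaluation every 10th step and print the resulting accuracy.
That's nice, but my test set is somewhat larger. I have about 2000 "images" of dimension 30x30x30x8, so feeding the whole set into the computation at once exhausts both my main memory and my GPU memory.
As a workaround, I have: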

accuracy = mymodel.accuracy(logits, label_placeholder)

test_accuracy_placeholder = tf.placeholder(tf.float32, name="test_accuracy")
test_summary = tf.scalar_summary("accuracy", test_accuracy_placeholder)


# training loop
for batch_idx, batch in enumerate(batches_in_trainset):

    #do training here
    ...

    # check accuracy every 10 batches
    if batch_idx % 10 == 0:

        test_accuracies = []  # start with empty accuracy list

        # inner testing loop
        for test_batch_idx in range(batches_in_testset):
            # get testset batch
            labels, images = testset.next_batch()

            # make feed dict
            test_feed_dict = {
                # ...
            }

            # calculate accuracy
            test_accuracy_val = sess.run(accuracy, feed_dict=test_feed_dict)

            # append accuracy to the list of test accuracies
            test_accuracies.append(test_accuracy_val)

        # "calculate" and log the average accuracy over all test batches
        summary_str = sess.run(test_summary,
                               feed_dict={
                                   test_accuracy_placeholder: sum(test_accuracies) / len(test_accuracies)})

        test_writer.add_summary(summary_str, batch_idx)

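Basically, I first collect the accuracies of all test-set batches and then feed their average into a second (disconnected) part of the graph that only writes it out as a summary.
This works in the sense that I can indeed compute the test-set accuracy at the chosen interval.
However, it feels very awkward, and it has the serious drawback that I cannot log anything other than the test-set accuracy.
For example, I would also like to log the loss value over the whole test set, histograms of the activations over the whole test set, and a few other variables.
Ideally this should work the way it does in the MNIST example. See the TensorBoard demo here: https://www.tensorflow.org/tensorboard/index.html#events
In that demo, all variables and metrics are evaluated on both the test set and the training set. I want that too! But I don't want to feed the complete test set into my model at once.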
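
For the scalar part of that wish list (accuracy, loss), the workaround above can be extended by accumulating sample-weighted sums instead of a plain list of batch accuracies. The sketch below is plain Python rather than TensorFlow; `RunningAverages` and the numbers in `fake_batches` are hypothetical names and values used only for illustration. Weighting by batch size matters when the last test batch is smaller than the others, because a plain mean of per-batch means is then slightly off. Histograms are not covered by this approach.

```python
from collections import defaultdict


class RunningAverages:
    """Accumulates sample-weighted sums of named scalar metrics across batches."""

    def __init__(self):
        self.totals = defaultdict(float)
        self.count = 0

    def update(self, batch_size, **batch_means):
        # Weight every per-batch mean by the number of examples in that batch.
        for name, value in batch_means.items():
            self.totals[name] += float(value) * batch_size
        self.count += batch_size

    def result(self):
        # Divide the accumulated sums by the total number of examples seen.
        if self.count == 0:
            raise ValueError("no batches were accumulated")
        return {name: total / self.count for name, total in self.totals.items()}


# Made-up per-batch values: (batch_size, accuracy, loss).
# In the real loop these would come from sess.run on each test batch.
fake_batches = [(100, 0.90, 0.30), (100, 0.80, 0.50), (50, 0.60, 0.90)]

averages = RunningAverages()
for size, acc, loss in fake_batches:
    averages.update(size, accuracy=acc, loss=loss)

test_metrics = averages.result()
print(test_metrics)
```

Each entry of `test_metrics` could then be fed through a placeholder into a scalar summary, exactly as the accuracy is in the workaround above.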

Best answer

It looks like this functionality has since been added in the form of the streaming metrics (in contrib).
https://www.tensorflow.org/api_guides/python/contrib.metrics
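
As I understand it, a streaming metric keeps running totals as state, exposes an update operation that is run once per batch, and exposes a value that is read after the last batch. The sketch below imitates that pattern in plain Python so the idea can be followed without TensorFlow; `StreamingAccuracy` and its method names are hypothetical and are not the contrib API, and the batches are made up.

```python
class StreamingAccuracy:
    """Pure-Python stand-in for a streaming accuracy metric (not TensorFlow code)."""

    def __init__(self):
        self.reset()

    def reset(self):
        # Plays the role of re-initializing the metric's state
        # before each evaluation pass over the test set.
        self.correct = 0
        self.total = 0

    def update(self, predictions, labels):
        # Plays the role of running the metric's update op on one test batch.
        if len(predictions) != len(labels):
            raise ValueError("predictions and labels must have the same length")
        self.correct += sum(int(p == y) for p, y in zip(predictions, labels))
        self.total += len(labels)

    def value(self):
        # Plays the role of reading the metric's value once all batches are seen.
        return self.correct / self.total if self.total else 0.0


metric = StreamingAccuracy()
test_batches = [
    ([1, 0, 2, 2], [1, 0, 1, 2]),  # 3 of 4 predictions match
    ([0, 1], [0, 1]),              # 2 of 2 predictions match
]
for predicted, expected in test_batches:
    metric.update(predicted, expected)

test_accuracy = metric.value()  # 5 matches out of 6 examples
print(test_accuracy)
```

The point of the pattern is that only one batch is in memory at a time, while the reported number still covers the whole test set. The state has to be reset before the next evaluation pass, otherwise results from earlier passes leak into later ones.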