computer-vision - What is the ksize parameter of tf.nn.max_pool used for?

In the definition of tf.nn.max_pool, what does ksize do?

tf.nn.max_pool(value, ksize, strides, padding, data_format='NHWC', name=None)

Performs the max pooling on the input.

Args:

value: A 4-D Tensor with shape [batch, height, width, channels] and type tf.float32.
ksize: A list of ints that has length >= 4. The size of the window for each dimension of the input tensor.

For example, if the input value is a tensor of shape [1, 64, 64, 3] and ksize=3, what does that mean?

Best answer

The documentation states that ksize is the size of the window for each dimension of the input tensor.

In general, for images, the input has shape [batch_size, 64, 64, 3] for an RGB image of 64x64 pixels.

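If you take the maximum over a 2x2 window, the kernel size ksize will typically be [1, 2, 2, 1]. On the batch dimension and the channel dimension, ksize is 1 because we don't want to take the maximum across multiple examples or across multiple channels.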
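To make the role of each ksize entry concrete, here is a minimal plain-Python sketch of max pooling over a nested-list tensor in NHWC layout. It is not TensorFlow's implementation: the function name max_pool_nhwc is hypothetical, it assumes windows must fit entirely inside the input (the behaviour of padding='VALID'), and it takes ksize and strides as four-element lists only.

```python
def max_pool_nhwc(value, ksize, strides):
    """Illustrative max pooling on a nested-list [batch, height, width, channels] tensor.

    Hypothetical helper for explanation only; assumes VALID-style windows.
    """
    kb, kh, kw, kc = ksize      # window size along each of the four dimensions
    sb, sh, sw, sc = strides    # step between windows along each dimension
    batch = len(value)
    height = len(value[0])
    width = len(value[0][0])
    channels = len(value[0][0][0])

    def starts(size, k, s):
        # Start indices of every window that fits entirely inside the dimension.
        return range(0, size - k + 1, s)

    return [
        [
            [
                [
                    max(
                        value[b + db][h + dh][w + dw][c + dc]
                        for db in range(kb)
                        for dh in range(kh)
                        for dw in range(kw)
                        for dc in range(kc)
                    )
                    for c in starts(channels, kc, sc)
                ]
                for w in starts(width, kw, sw)
            ]
            for h in starts(height, kh, sh)
        ]
        for b in starts(batch, kb, sb)
    ]


# One 4x4 "image" with 2 channels: channel 0 counts up, channel 1 counts down.
image = [[[[h * 4 + w, -(h * 4 + w)] for w in range(4)] for h in range(4)]]

# ksize [1, 2, 2, 1]: a 2x2 spatial window, one example and one channel at a time.
pooled = max_pool_nhwc(image, ksize=[1, 2, 2, 1], strides=[1, 2, 2, 1])
# pooled == [[[[5, 0], [7, -2]], [[13, -8], [15, -10]]]]

# The same window on a [1, 64, 64, 3] input halves height and width only.
rgb = [[[[0.0, 0.0, 0.0] for _ in range(64)] for _ in range(64)]]
out = max_pool_nhwc(rgb, ksize=[1, 2, 2, 1], strides=[1, 2, 2, 1])
out_shape = (len(out), len(out[0]), len(out[0][0]), len(out[0][0][0]))
```

Because the first and last entries of ksize are 1, each output value is the maximum of a 2x2 spatial patch within a single example and a single channel, so the two channels in the small example are pooled independently.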

Source: a similar question on Stack Overflow: https://stackoverflow.com/questions/38601452/