tensorflow - Using binary_crossentropy loss in Keras (Tensorflow backend)

In the training example in the Keras documentation,

https://keras.io/getting-started/sequential-model-guide/#training

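binary_crossentropy is used and a sigmoid activation is added to the last layer of the network. But is it necessary to add the sigmoid in the last layer? Here is what I found in the source code: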

def binary_crossentropy(output, target, from_logits=False):
  """Binary crossentropy between an output tensor and a target tensor.
  Arguments:
      output: A tensor.
      target: A tensor with the same shape as `output`.
      from_logits: Whether `output` is expected to be a logits tensor.
          By default, we consider that `output`
          encodes a probability distribution.
  Returns:
      A tensor.
  """
  # Note: nn.softmax_cross_entropy_with_logits
  # expects logits, Keras expects probabilities.
  if not from_logits:
    # transform back to logits
    epsilon = _to_tensor(_EPSILON, output.dtype.base_dtype)
    output = clip_ops.clip_by_value(output, epsilon, 1 - epsilon)
    output = math_ops.log(output / (1 - output))
  return nn.sigmoid_cross_entropy_with_logits(labels=target, logits=output)

Keras calls sigmoid_cross_entropy_with_logits in Tensorflow, but inside the sigmoid_cross_entropy_with_logits function, sigmoid(logits) is computed again.

https://www.tensorflow.org/versions/master/api_docs/python/tf/nn/sigmoid_cross_entropy_with_logits
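To make the round trip concrete, here is a minimal pure-Python sketch of the code path quoted above when from_logits=False. It is not the Keras implementation: the helper names are made up for illustration, and EPSILON = 1e-7 is an assumption about the value of _EPSILON.

```python
import math

EPSILON = 1e-7  # assumed value of Keras' _EPSILON; not taken from the quoted code


def sigmoid(x):
    return 1.0 / (1.0 + math.exp(-x))


def cross_entropy_from_probability(p, target):
    """Textbook binary cross-entropy evaluated on a probability p."""
    return -(target * math.log(p) + (1 - target) * math.log(1 - p))


def keras_style_loss(p, target):
    """Hypothetical helper mimicking the quoted path with from_logits=False."""
    p = min(max(p, EPSILON), 1 - EPSILON)  # clip_by_value
    logit = math.log(p / (1 - p))          # "transform back to logits"
    # sigmoid_cross_entropy_with_logits is mathematically the cross-entropy
    # of sigmoid(logit), so the sigmoid is effectively applied a second time.
    return cross_entropy_from_probability(sigmoid(logit), target)


logit = 0.8
p = sigmoid(logit)                    # what the final sigmoid layer outputs
recovered = math.log(p / (1 - p))     # what binary_crossentropy reconstructs

print(recovered)
print(cross_entropy_from_probability(p, 1.0), keras_style_loss(p, 1.0))
```

For a moderate value like this, the recovered logit and both losses agree up to floating-point error, so the sigmoid layer followed by the conversion back to logits is redundant work rather than a different computation.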

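So I think it makes no sense to add a sigmoid at the end, yet seemingly every Keras binary/multi-label classification example and tutorial I found online adds a sigmoid at the end. Also, I don't understand what this comment means: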

# Note: nn.softmax_cross_entropy_with_logits
# expects logits, Keras expects probabilities.

Why does Keras expect probabilities? Doesn't it use the nn.softmax_cross_entropy_with_logits function? Does this make sense?

Thanks.

Best answer

You're right, that is exactly what happens. I believe this is due to historical reasons.

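Keras was created before tensorflow, as a wrapper around theano. In theano, you have to compute the sigmoid/softmax manually and then apply the cross-entropy loss function. Tensorflow can do all of this in one fused op, but by then the community had already adopted the API with a sigmoid/softmax layer.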

If you want to avoid the unnecessary logit <-> probability conversion, call the binary_crossentropy loss with from_logits=True and don't add the sigmoid layer.
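As a rough illustration of what the conversion costs beyond extra computation, the pure-Python sketch below compares the two paths. It uses the numerically stable formula given in the Tensorflow documentation for sigmoid_cross_entropy_with_logits, max(x, 0) - x * z + log(1 + exp(-abs(x))). The helper names are hypothetical and EPSILON = 1e-7 is an assumed value for _EPSILON.

```python
import math

EPSILON = 1e-7  # assumed value of Keras' _EPSILON


def loss_from_logits(x, z):
    """Stable sigmoid cross-entropy on a logit x with label z."""
    return max(x, 0.0) - x * z + math.log1p(math.exp(-abs(x)))


def loss_via_probability(x, z):
    """Sigmoid layer, then the clip and log-odds conversion, then the same loss."""
    p = 1.0 / (1.0 + math.exp(-x))         # final sigmoid layer
    p = min(max(p, EPSILON), 1 - EPSILON)  # clip applied when from_logits=False
    recovered = math.log(p / (1 - p))      # back to a logit
    return loss_from_logits(recovered, z)


# Label 0 with a moderately wrong and a confidently wrong logit.
for x in (2.0, 30.0):
    print(x, loss_from_logits(x, 0.0), loss_via_probability(x, 0.0))
```

For x = 2.0 the two paths agree. For x = 30.0 the clip caps the recovered logit at roughly 16.1, so the loss through the probability path saturates while the from_logits path still reports a loss near 30. In current tf.keras releases the same option is exposed as tf.keras.losses.BinaryCrossentropy(from_logits=True), used with a final layer that has no activation.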