Problem Description
To my understanding, in order to update model parameters through gradient descent, the algorithm needs at some point to calculate the derivative of the error function E with respect to the output y: dE/dy. Nevertheless, I've seen that if you want to use a custom loss function in Keras, you only need to define E, not its derivative. What am I missing?
Each loss function has a different derivative, for example:
If the loss function is the mean squared error, E = (y_true - y)^2: dE/dy = -2(y_true - y)
If the loss function is cross entropy, E = -y_true * log(y): dE/dy = -y_true / y
Again, how is it possible that the model never asks me what the derivative is? How does it calculate the gradient of the loss function with respect to the parameters from just the definition of E?
Thanks
Answer
To my understanding, as long as every operator you use in your error function already has a predefined gradient, the underlying framework will calculate the gradient of your loss function for you: it applies the chain rule (reverse-mode automatic differentiation) to the graph of operations your loss builds, so you never have to supply dE/dy yourself.
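A minimal sketch of what that looks like in practice (the tiny model and the name my_mse below are made up for illustration): the custom loss only defines E out of TensorFlow ops that already have registered gradients, and Keras takes it from there.

import tensorflow as tf

# Custom loss: only E is defined, built from ops (subtract, square,
# reduce_mean) whose gradients TensorFlow already knows.
def my_mse(y_true, y_pred):
    return tf.reduce_mean(tf.square(y_true - y_pred))

# A hypothetical small model, just to show where the loss plugs in.
model = tf.keras.Sequential([
    tf.keras.Input(shape=(4,)),
    tf.keras.layers.Dense(8, activation="relu"),
    tf.keras.layers.Dense(1),
])
model.compile(optimizer="adam", loss=my_mse)
# model.fit(x, y) would now backpropagate through my_mse without
# ever being told its derivative explicitly.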
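You can also watch the framework produce dE/dy on its own. The sketch below (with arbitrarily chosen values) asks tf.GradientTape for the gradient of a mean-squared-error E and compares it with the analytical -2(y_true - y)/N discussed in the question:

import tensorflow as tf

y_true = tf.constant([1.0, 0.0, 2.0])
y_pred = tf.Variable([0.5, 0.2, 1.5])

with tf.GradientTape() as tape:
    # E is defined only as a value; no derivative is supplied.
    loss = tf.reduce_mean(tf.square(y_true - y_pred))

# Reverse-mode autodiff walks back through reduce_mean, square and
# subtract, each of which has a predefined gradient.
grad = tape.gradient(loss, y_pred)

print(grad.numpy())                              # ~[-0.333,  0.133, -0.333]
print((-2.0 * (y_true - y_pred) / 3.0).numpy())  # same values, analytically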