Problem description
I am trying to implement a neural network with ReLU.
input layer -> 1 hidden layer -> ReLU -> output layer -> softmax layer
Above is the architecture of my neural network. I am confused about the backpropagation through this ReLU. For the derivative of ReLU, if x <= 0 the output is 0, and if x > 0 the output is 1. So when I calculate the gradient, does that mean I kill gradient descent if x <= 0?
Can someone explain the backpropagation of my neural network architecture step by step?
Recommended answer
The ReLU function is defined as: for x > 0 the output is x, i.e. f(x) = max(0, x).
So the derivative f'(x) is: if x < 0 the output is 0, and if x > 0 the output is 1.
The derivative f'(0) is not defined, so in practice it is either set to 0 or the activation function is modified to f(x) = max(e, x) for a small e.
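For illustration, here is a minimal NumPy sketch of the rectifier and the derivative convention described above; the names relu and relu_grad are just placeholders, and the convention f'(0) = 0 is assumed.

```python
import numpy as np

def relu(z):
    # Forward pass: f(z) = max(0, z), applied element-wise.
    return np.maximum(0.0, z)

def relu_grad(z):
    # Derivative: 1 where z > 0, else 0 (using the convention f'(0) = 0).
    return (z > 0).astype(float)

# During backpropagation the incoming gradient is simply masked:
#   grad_z = grad_a * relu_grad(z)
# Units with z <= 0 pass no gradient back, but the other units still do.
```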
Generally: a ReLU is a unit that uses the rectifier activation function. That means it works exactly like any other hidden layer, except that instead of tanh(x), sigmoid(x), or whatever activation you use, you use f(x) = max(0, x).
If you have already written code for a working multilayer network with sigmoid activation, it is literally a one-line change. Nothing about forward or backward propagation changes algorithmically. If you haven't got the simpler model working yet, go back and start with that first. Otherwise your question isn't really about ReLU, but about implementing a neural network as a whole.
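To address the "step by step" part of the question, here is a hedged NumPy sketch of one forward and backward pass for the stated architecture (input -> hidden -> ReLU -> output -> softmax), assuming a cross-entropy loss and row-major (samples-by-features) arrays; the variable names (W1, b1, W2, b2, etc.) are illustrative and not taken from the question.

```python
import numpy as np

def forward_backward(X, Y, W1, b1, W2, b2):
    """One pass through: input -> hidden -> ReLU -> output -> softmax.

    X: inputs of shape (n_samples, n_in); Y: one-hot labels (n_samples, n_out).
    Softmax is paired with cross-entropy, which gives the simple gradient
    (probs - Y) at the output pre-activations.
    """
    # Forward pass
    z1 = X @ W1 + b1                                   # hidden pre-activation
    a1 = np.maximum(0.0, z1)                           # ReLU
    z2 = a1 @ W2 + b2                                  # output scores
    z2 = z2 - z2.max(axis=1, keepdims=True)            # numerical stability
    probs = np.exp(z2) / np.exp(z2).sum(axis=1, keepdims=True)   # softmax
    loss = -np.mean(np.sum(Y * np.log(probs + 1e-12), axis=1))   # cross-entropy

    # Backward pass
    n = X.shape[0]
    dz2 = (probs - Y) / n          # gradient w.r.t. output scores
    dW2 = a1.T @ dz2
    db2 = dz2.sum(axis=0)
    da1 = dz2 @ W2.T               # gradient flowing back into the ReLU
    dz1 = da1 * (z1 > 0)           # ReLU derivative: zero where z1 <= 0
    dW1 = X.T @ dz1
    db1 = dz1.sum(axis=0)
    return loss, (dW1, db1, dW2, db2)
```

Note that the gradient is only zeroed for the individual hidden units whose pre-activation z1 is non-positive for a given sample; the other units still propagate gradient, so gradient descent as a whole is not "killed" when some x <= 0.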