Question
I've been reading some things on neural networks and I understand the general principle of a single-layer neural network. I understand the need for additional layers, but why are nonlinear activation functions used?
This question is followed by this one: What is a derivative of the activation function used for in backpropagation?
Answer
The purpose of the activation function is to introduce non-linearity into the network.

In turn, this allows you to model a response variable (aka target variable, class label, or score) that varies non-linearly with its explanatory variables.
Non-linear means that the output cannot be reproduced from a linear combination of the inputs (which is not the same as an output that plots as a straight line; the word for that is affine).
Another way to think of it: without a non-linear activation function in the network, a NN, no matter how many layers it had, would behave just like a single-layer perceptron, because composing these linear layers gives you just another linear function (see the definition just above).
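To see that collapse numerically, here is a minimal numpy sketch (the layer sizes and random weights are arbitrary, chosen only for illustration): two stacked linear layers with no activation in between are exactly equivalent to a single linear layer whose weight matrix is the product of the two:

>>> import numpy as NP
>>> x = NP.random.rand(4)              # an arbitrary input vector
>>> W1 = NP.random.rand(5, 4)          # "layer 1" weights
>>> W2 = NP.random.rand(3, 5)          # "layer 2" weights
>>> two_layers = W2 @ (W1 @ x)         # two linear layers, no activation in between
>>> one_layer = (W2 @ W1) @ x          # one linear layer with the combined weights
>>> NP.allclose(two_layers, one_layer)
True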
>>> import numpy as NP
>>> in_vec = NP.random.rand(10)
>>> in_vec
array([ 0.94, 0.61, 0.65, 0. , 0.77, 0.99, 0.35, 0.81, 0.46, 0.59])
>>> # common activation function, hyperbolic tangent
>>> out_vec = NP.tanh(in_vec)
>>> out_vec
array([ 0.74, 0.54, 0.57, 0. , 0.65, 0.76, 0.34, 0.67, 0.43, 0.53])
A common activation function used in backprop (the hyperbolic tangent), evaluated from -2 to 2:
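A minimal sketch to produce that plot, assuming matplotlib is available alongside numpy:

>>> import numpy as NP
>>> import matplotlib.pyplot as plt
>>> x = NP.linspace(-2, 2, 200)        # evaluate tanh on a fine grid from -2 to 2
>>> _ = plt.plot(x, NP.tanh(x))        # the familiar S-shaped curve through the origin
>>> plt.show()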