


I realize that this is probably a very niche question, but has anyone had experience working with continuous neural networks? I'm specifically interested in what a continuous neural network may be useful for, compared with what you normally use discrete neural networks for.


For clarity, I will explain what I mean by continuous neural network, as I suppose it can be interpreted to mean different things. I do not mean that the activation function is continuous. Rather, I am alluding to the idea of increasing the number of neurons in the hidden layer to infinity.


So for clarity, here is the architecture of your typical discrete NN: [figure]. The x are the inputs, g is the activation of the hidden layer, the v are the weights of the hidden layer, the w are the weights of the output layer, b is the bias, and the output layer has a linear activation (namely none).
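Since the figure is not visible here, the architecture described above can be sketched in code. This is only a minimal sketch under assumptions: the function name `discrete_nn`, the array shapes, and the choice of tanh for g are mine, not taken from the figure, and hidden-layer biases are assumed to be folded into v (for example by appending a constant 1 to x).

```python
import numpy as np

def discrete_nn(x, V, w, b, g=np.tanh):
    """One-hidden-layer network with a linear output (illustrative only).

    x: input vector, shape (d,)
    V: hidden-layer weights v, shape (n_hidden, d)
    w: output-layer weights, shape (n_hidden,)
    b: output bias (scalar)
    g: hidden activation; tanh is an assumption
    """
    hidden = g(V @ x)       # activations of the n_hidden hidden neurons
    return b + w @ hidden   # finite weighted sum plus bias, no output activation

# Hypothetical sizes and random weights, purely for illustration.
rng = np.random.default_rng(0)
x = rng.normal(size=3)
V = rng.normal(size=(5, 3))
w = rng.normal(size=5)
y = discrete_nn(x, V, w, b=0.1)
```

The output is a finite sum over the hidden neurons, one term per neuron.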


The difference between a discrete NN and a continuous NN is depicted by this figure: [figure]. That is, you let the number of hidden neurons become infinite so that your final output is an integral. In practice this means that instead of computing a finite sum, you must approximate the corresponding integral with quadrature.
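To make the quadrature point concrete, here is a rough sketch of how such an output might be evaluated. Everything specific in it is an assumption for illustration: the hidden units are indexed by a continuous variable u in [0, 1], the functions `v_fn` and `w_fn` are hypothetical stand-ins for the weight functions, tanh is assumed for g, and Gauss-Legendre is just one possible quadrature rule.

```python
import numpy as np

def continuous_nn(x, v_fn, w_fn, b, n_nodes=32, g=np.tanh):
    """Approximate b + integral over [0, 1] of w(u) * g(v(u) . x) du."""
    # Gauss-Legendre nodes and weights are defined on [-1, 1].
    nodes, weights = np.polynomial.legendre.leggauss(n_nodes)
    u = 0.5 * (nodes + 1.0)   # map the nodes to [0, 1]
    q = 0.5 * weights         # rescale the weights for the interval of length 1
    V = v_fn(u)               # shape (n_nodes, d): one weight vector per node
    integrand = w_fn(u) * g(V @ x)
    return b + q @ integrand  # weighted sum approximating the integral

# Hypothetical smooth weight functions, chosen only for the example.
def v_fn(u):
    return np.stack([np.cos(2 * np.pi * u), np.sin(2 * np.pi * u)], axis=1)

def w_fn(u):
    return 1.0 + u

x = np.array([0.5, -1.0])
y_coarse = continuous_nn(x, v_fn, w_fn, b=0.1, n_nodes=32)
y_fine = continuous_nn(x, v_fn, w_fn, b=0.1, n_nodes=64)
```

Note that the quadrature rule ends up looking like a discrete network whose hidden units sit at the quadrature nodes, with the quadrature weights folded into the output weights. Increasing `n_nodes` refines the approximation of the integral rather than adding free parameters.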


Apparently it's a common misconception with neural networks that too many hidden neurons produce over-fitting.


My question is specifically this: given this definition of discrete and continuous neural networks, does anyone have experience working with the latter, and what sort of things did you use them for?


Further description of the topic can be found here: http://www.iro.umontreal.ca/~lisa/seminaires/18-04-2006.pdf



In the past I've worked on a few research projects using continuous NNs. Activation was done using a bipolar hyperbolic tangent; the network took several hundred floating-point inputs and output around one hundred floating-point values.


In this particular case the aim of the network was to learn the dynamic equations of a mineral train. The network was given the current state of the train and predicted speed, inter-wagon dynamics and other train behaviour 50 seconds into the future.


The rationale for this particular project was mainly about performance. This was being targeted for an embedded device, and evaluating the NN was much more performance-friendly than solving a traditional ODE (ordinary differential equation) system.


In general a continuous NN should be able to learn any kind of function. This is particularly useful when it's impossible or extremely difficult to solve a system using deterministic methods. Binary networks, by contrast, are often used for pattern recognition and classification.


Given their non-deterministic nature, NNs of any kind are touchy beasts; choosing the right kinds of inputs and network architecture can be something of a black art.


08-20 09:08