What kinds of sequence models are there?

  [Course slide — Coursera, Deep Learning 5, Sequence Models, Week 1: Recurrent Neural Networks]

Notation:

  [course slides]

RNN - Recurrent Neural Network

What problems do traditional neural networks have with sequence inputs?

  [course slide]

RNNs do not have the problems above. Note that this part also introduces the concept of the BRNN, the bidirectional RNN.

  [course slide]

The activation function g is most often tanh; ReLU is sometimes used, but it is not common.

  [course slides]
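The forward step of a basic RNN cell takes only a few lines of numpy. Below is a minimal sketch following the course's notation (Waa, Wax, Wya, ba, by); the softmax output assumes a per-timestep classification task.

    import numpy as np

    def rnn_cell_forward(x_t, a_prev, Waa, Wax, Wya, ba, by):
        # a<t> = tanh(Waa @ a<t-1> + Wax @ x<t> + ba)
        a_t = np.tanh(Waa @ a_prev + Wax @ x_t + ba)
        # y_hat<t> = softmax(Wya @ a<t> + by)
        z = Wya @ a_t + by
        y_hat_t = np.exp(z - z.max()) / np.exp(z - z.max()).sum()
        return a_t, y_hat_t

The same cell (the same weights) is applied at every timestep, passing a_t forward as a_prev of the next step.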

Backpropagation through time

  [course slides]
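In backpropagation through time, the overall loss is just the sum of the per-timestep losses, and the gradient flows backward through every timestep (hence "through time"). A minimal sketch of the loss, assuming y_hat is a list of predicted distributions and y is a list of true word indices:

    import numpy as np

    def sequence_loss(y_hat, y):
        # L = sum over t of L<t>, where L<t> = -log(y_hat<t>[true word at t])
        return -sum(np.log(probs[idx]) for probs, idx in zip(y_hat, y))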

Different types of RNNs

  [course slides]

Language model and sequence generation

Language modeling is used to find the most likely sentence: the model assigns a probability to any sequence of words.

  [course slides]
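Concretely, the model scores a sentence with the chain rule, P(y1, y2, y3) = P(y1) * P(y2 | y1) * P(y3 | y1, y2), where each factor is one softmax output of the RNN. A toy example with made-up conditional probabilities:

    # hypothetical conditional probabilities read off the model's softmax outputs
    cond_probs = [0.01, 0.2, 0.5]   # P(y1), P(y2|y1), P(y3|y1,y2)
    p_sentence = 1.0
    for p in cond_probs:
        p_sentence *= p             # P(sentence) = 0.001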

Once a language model is trained, one fun application is having it make up sentences of its own, i.e., sampling novel sequences.

Sample novel sequences

  [course slide]
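A minimal sampling loop, assuming a hypothetical step(x, a) function that runs one RNN step and returns the softmax distribution over the vocabulary together with the new hidden state:

    import numpy as np

    def sample_sequence(step, a0, vocab_size, eos_idx, max_len=50):
        x = np.zeros(vocab_size)          # x<1> = 0
        a = a0                            # a<0> = 0
        sampled = []
        for _ in range(max_len):
            probs, a = step(x, a)         # one forward step
            idx = np.random.choice(vocab_size, p=probs)
            sampled.append(idx)
            if idx == eos_idx:            # stop once <EOS> is sampled
                break
            x = np.zeros(vocab_size)
            x[idx] = 1                    # feed the sampled word back as the next input
        return sampled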

Besides the common word-level language model, there is also a much less common character-level language model.

  [course slide]

Vanishing gradient problem

Because each word in an RNN is influenced mainly by the words near it, the network handles sentences like the one in the slide below poorly. After seeing a noun, it has to remember for a long time whether that noun (cat) was singular or plural, until the matching verb (was/were) finally appears, and that kind of long-range memory is not something a basic RNN is good at.

Besides the vanishing gradient problem there is also the exploding gradient problem, but exploding gradients are comparatively easy to fix: the solution is gradient clipping, i.e., when gradient values get too large, clip them according to a maximum value (threshold).

  [course slide]
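A sketch of elementwise gradient clipping: every gradient value outside [-max_value, max_value] is pushed back to the boundary before the parameter update.

    import numpy as np

    def clip_gradients(grads, max_value=5.0):
        # clip each gradient array elementwise into [-max_value, max_value]
        return [np.clip(g, -max_value, max_value) for g in grads]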

  

GRU - Gated Recurrent Unit

Next: how to address the vanishing gradient problem.

First, look at the basic RNN.

  [course slide]

Then compare it with the GRU.

  [course slide]

The above is a simplified GRU, made for ease of understanding. The full GRU looks like this:

  [course slide]
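The full GRU step as numpy, a minimal sketch following the course notation, where Γu is the update gate and Γr the relevance gate:

    import numpy as np

    def sigmoid(z):
        return 1.0 / (1.0 + np.exp(-z))

    def gru_cell(x_t, c_prev, Wu, bu, Wr, br, Wc, bc):
        concat = np.concatenate([c_prev, x_t])
        gamma_u = sigmoid(Wu @ concat + bu)    # update gate
        gamma_r = sigmoid(Wr @ concat + br)    # relevance gate
        c_tilde = np.tanh(Wc @ np.concatenate([gamma_r * c_prev, x_t]) + bc)
        c_t = gamma_u * c_tilde + (1 - gamma_u) * c_prev   # keep old memory or take the candidate
        return c_t

Because gamma_u can stay very close to 0, c_t can carry c_prev almost unchanged across many timesteps, which is what counteracts the vanishing gradient.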

How do you choose between LSTM and GRU? Neither is strictly better; different problems may suit different algorithms.

The LSTM is more complex than the GRU; the GRU, being simpler, is faster to compute. A GRU has two gates, while an LSTM has three. If you have to pick one, the LSTM is a reasonable default.

  [course slides]
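For comparison, one LSTM step in the same style, where Γf, Γu, Γo are the forget, update, and output gates. Unlike the GRU, the LSTM keeps the memory cell c and the hidden state a separate.

    import numpy as np

    def lstm_cell(x_t, a_prev, c_prev, Wf, bf, Wu, bu, Wc, bc, Wo, bo):
        sigmoid = lambda z: 1.0 / (1.0 + np.exp(-z))
        concat = np.concatenate([a_prev, x_t])
        gamma_f = sigmoid(Wf @ concat + bf)    # forget gate
        gamma_u = sigmoid(Wu @ concat + bu)    # update gate
        gamma_o = sigmoid(Wo @ concat + bo)    # output gate
        c_tilde = np.tanh(Wc @ concat + bc)    # candidate memory
        c_t = gamma_f * c_prev + gamma_u * c_tilde
        a_t = gamma_o * np.tanh(c_t)
        return a_t, c_t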

BRNN - Bidirectional RNN

The following problem needs a BRNN to handle it.

  [course slide]

In practice, the BRNN + LSTM combination is the most commonly used.

  [course slide]
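A minimal BRNN + LSTM sketch in tf.keras (the vocabulary and layer sizes here are made up); Bidirectional wraps the LSTM so that the prediction at each timestep can use both past and future context:

    import tensorflow as tf

    model = tf.keras.Sequential([
        tf.keras.layers.Embedding(input_dim=10000, output_dim=64),
        tf.keras.layers.Bidirectional(tf.keras.layers.LSTM(64, return_sequences=True)),
        tf.keras.layers.Dense(2, activation="softmax"),  # e.g., a per-word tag such as "is this a name?"
    ])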

Deep RNNs

  [course slide]
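A stacked (deep) RNN sketch in tf.keras with made-up sizes. Every layer except the top one must return the full sequence so that the layer above it receives an input at each timestep:

    import tensorflow as tf

    inputs = tf.keras.Input(shape=(None, 50))                           # (Tx, n_features)
    x = tf.keras.layers.SimpleRNN(64, return_sequences=True)(inputs)   # layer 1
    x = tf.keras.layers.SimpleRNN(64, return_sequences=True)(x)        # layer 2
    x = tf.keras.layers.SimpleRNN(64)(x)                               # layer 3: last state only
    outputs = tf.keras.layers.Dense(1)(x)
    model = tf.keras.Model(inputs, outputs)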

Questions:

1. I don't yet understand the concept of a gate.

2. I don't yet understand the LSTM.

3. One-hot vector: a vector with a single 1 and 0s everywhere else; see the sketch below.
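A quick numpy illustration of a one-hot vector over a toy three-word vocabulary:

    import numpy as np

    vocab = ["a", "cat", "sat"]
    one_hot = np.zeros(len(vocab))
    one_hot[vocab.index("cat")] = 1    # -> array([0., 1., 0.])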
