I am trying to build a 3-layer RNN with Keras. Here is part of the code:

    model = Sequential()
    model.add(Embedding(input_dim = 91, output_dim = 128, input_length =max_length))
    model.add(GRUCell(units = self.neurons, dropout = self.dropval,  bias_initializer = bias))
    model.add(GRUCell(units = self.neurons, dropout = self.dropval,  bias_initializer = bias))
    model.add(GRUCell(units = self.neurons, dropout = self.dropval,  bias_initializer = bias))


It fails with the following error:

    TypeError: call() missing 1 required positional argument: 'states'


    ~/anaconda3/envs/hw3/lib/python3.5/site-packages/keras/ in add(self, layer)
    487                           output_shapes=[self.outputs[0]._keras_shape])
    488         else:
    --> 489             output_tensor = layer(self.outputs[0])
    490             if isinstance(output_tensor, list):
    491                 raise TypeError('All layers in a Sequential model '

    ~/anaconda3/envs/hw3/lib/python3.5/site-packages/keras/engine/ in __call__(self, inputs, **kwargs)
    602             # Actually call the layer, collecting output(s), mask(s), and shape(s).
    --> 603             output =, **kwargs)
    604             output_mask = self.compute_mask(inputs, previous_mask)


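  1. Don't use the Cell classes (i.e. GRUCell or LSTMCell) directly in Keras. They are computation cells that are wrapped by the corresponding layers. A cell's call() expects both the inputs and the states of a single timestep, which is why adding one to a Sequential model raises the error above. Use the layer classes (i.e. GRU or LSTM) instead: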

    model.add(GRU(units = self.neurons, dropout = self.dropval, bias_initializer = bias))
    model.add(GRU(units = self.neurons, dropout = self.dropval, bias_initializer = bias))
    model.add(GRU(units = self.neurons, dropout = self.dropval, bias_initializer = bias))

The LSTM and GRU layers use their corresponding cells to perform the computation over all the timesteps. Read this SO answer to learn more about the difference between them.


  2. When you stack multiple RNN layers on top of each other, you need to set their return_sequences argument to True so that each layer outputs every timestep, which the next RNN layer then consumes as its input. Whether you also set it on the last RNN layer depends on your architecture and the problem you are trying to solve:

    model.add(GRU(units = self.neurons, dropout = self.dropval, bias_initializer = bias, return_sequences=True))
    model.add(GRU(units = self.neurons, dropout = self.dropval, bias_initializer = bias, return_sequences=True))
    model.add(GRU(units = self.neurons, dropout = self.dropval, bias_initializer = bias))
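
To make both points concrete, here is a rough pure-Python sketch of the idea. It is not how Keras implements things: ToyCell and ToyRNN are hypothetical names made up for illustration, and the "cell" is a single scalar tanh recurrence rather than a real GRU. It only shows that a cell handles one timestep and needs a state, while the layer loops the cell over the sequence and decides whether to return every timestep.

```python
import math


class ToyCell:
    """Hypothetical stand-in for a cell: computes ONE timestep.

    Like a Keras cell, call() needs the previous state in addition to
    the input, so it cannot be called with the input alone.
    """

    def __init__(self, weight=0.5):
        self.weight = weight

    def call(self, x, state):
        new_state = math.tanh(x + self.weight * state)
        # Return (output, new state); for this toy cell they are the same.
        return new_state, new_state


class ToyRNN:
    """Hypothetical stand-in for a layer: loops a cell over all timesteps."""

    def __init__(self, cell, return_sequences=False):
        self.cell = cell
        self.return_sequences = return_sequences

    def __call__(self, sequence):
        state = 0.0  # initial state
        outputs = []
        for x in sequence:
            out, state = self.cell.call(x, state)
            outputs.append(out)
        # Either every timestep's output, or only the last one.
        return outputs if self.return_sequences else outputs[-1]


sequence = [0.1, 0.2, 0.3, 0.4]

# Intermediate layers return the full sequence so the next layer
# has one input per timestep to loop over.
layer1 = ToyRNN(ToyCell(), return_sequences=True)
layer2 = ToyRNN(ToyCell(), return_sequences=True)
layer3 = ToyRNN(ToyCell())  # last layer: only the final output

hidden = layer2(layer1(sequence))  # still one value per timestep
final = layer3(hidden)             # a single value
```

In this sketch, calling `ToyCell().call(0.1)` without a state fails with the same kind of TypeError as in the question, and an intermediate ToyRNN without return_sequences=True would hand the next layer a single number instead of a sequence to iterate over.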

