Generating Text with Deep Learning
Decoder Training Setup

The decoder looks a lot like the encoder (phew!), with an input layer and an LSTM layer that we use together:

decoder_inputs = Input(shape=(None, num_decoder_tokens))
# This time we care about full return sequences
decoder_lstm = LSTM(100, return_sequences=True, return_state=True)

However, with our decoder, we pass in the state data from the encoder, along with the decoder inputs. This time, we’ll keep the output instead of the states:

# The two states will be discarded for now
decoder_outputs, decoder_state_hidden, decoder_state_cell = decoder_lstm(decoder_inputs, initial_state=encoder_states)
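To make the roles of return_sequences, return_state, and initial_state more concrete, here is a minimal sketch of a single-unit LSTM in plain Python. It is not how Keras implements its layer: the function name toy_lstm, the scalar weights, and the choice to share one set of weights across all gates are assumptions made purely for illustration.

```python
import math


def sigmoid(x):
    return 1.0 / (1.0 + math.exp(-x))


def toy_lstm(inputs, initial_state=(0.0, 0.0)):
    """Single-unit LSTM with made-up scalar weights (illustrative only).

    Returns (hidden state at every step, final hidden state, final cell state),
    mirroring return_sequences=True together with return_state=True.
    """
    h, c = initial_state
    w_x, w_h = 0.5, 0.3  # hypothetical weights, shared by all gates for brevity
    outputs = []
    for x in inputs:
        z = w_x * x + w_h * h
        i = sigmoid(z)        # input gate
        f = sigmoid(z)        # forget gate
        o = sigmoid(z)        # output gate
        g = math.tanh(z)      # candidate cell value
        c = f * c + i * g     # new cell state
        h = o * math.tanh(c)  # new hidden state
        outputs.append(h)     # keep every step (the "full return sequence")
    return outputs, h, c


# "Encoder": discard the per-step outputs, keep only the final states.
_, enc_h, enc_c = toy_lstm([1.0, -0.5, 0.25])

# "Decoder": start from the encoder states and keep the per-step outputs.
decoder_outputs, dec_h, dec_c = toy_lstm([0.2, 0.4], initial_state=(enc_h, enc_c))

# Same decoder inputs, but starting from zero states instead.
cold_outputs, _, _ = toy_lstm([0.2, 0.4])
```

The decoder produces one output per input step, its last output equals its final hidden state, and seeding it with the encoder states changes every output compared with the cold start.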

We also need to run the output through a final activation layer, using the softmax function, that gives us a probability distribution (where all probabilities sum to one) over the tokens at each time step. This final layer also transforms our LSTM output from the dimensionality we gave it (100 in the LSTM above) to the number of unique tokens in the target vocabulary, num_decoder_tokens, which is usually much larger.

decoder_dense = Dense(num_decoder_tokens, activation='softmax')
decoder_outputs = decoder_dense(decoder_outputs)
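As a rough sketch of what a dense softmax layer computes for one time step, the plain-Python example below projects a small vector onto a larger set of token scores and normalizes them. The helper names softmax and dense_softmax, the 3-unit output, the 5-token vocabulary, and all weight values are hypothetical and chosen only to keep the numbers readable.

```python
import math


def softmax(scores):
    # Subtract the maximum score for numerical stability.
    m = max(scores)
    exps = [math.exp(s - m) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


def dense_softmax(vector, weights, biases):
    """Project `vector` onto one score per token, then apply softmax.

    `weights` holds one row per target token; each row has one weight
    per element of `vector`.
    """
    scores = [
        sum(v * w for v, w in zip(vector, row)) + b
        for row, b in zip(weights, biases)
    ]
    return softmax(scores)


# Hypothetical LSTM output at a single time step (3 units).
lstm_output = [0.2, -0.1, 0.4]

# Hypothetical weights and biases for a 5-token target vocabulary.
weights = [
    [0.1, 0.0, -0.2],
    [0.3, 0.5, 0.1],
    [-0.4, 0.2, 0.0],
    [0.0, -0.3, 0.6],
    [0.2, 0.2, 0.2],
]
biases = [0.0, 0.1, 0.0, -0.1, 0.0]

probabilities = dense_softmax(lstm_output, weights, biases)
```

The result has one probability per target token, and the probabilities sum to one. In the lesson's model the same projection is applied at every decoder time step.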

Keras’s implementation could work with several layer types, but Dense is the least complex, so we’ll go with that. We also need to modify our import statement to include it before running the code:

from keras.layers import Input, LSTM, Dense



If you take a look at the provided code, you’ll see that we’ve already set up the decoder input and LSTM layers for you.

Now it’s your turn to grab the decoder_outputs, decoder_state_hidden, and decoder_state_cell by calling the decoder LSTM layer on decoder_inputs. This time though, pass in the encoder_states as the initial_state.


Alright, time for something new! Add Dense to the layer types you’re importing from Keras.

Then, build the final Dense layer, decoder_dense. Pass in the following arguments:

  • the number of decoder tokens
  • an activation of "softmax"

Run those decoder_outputs through the Dense layer you just created. Assign the resulting value to decoder_outputs.
