Understanding Seq2Seq Neural Networks – Part 5: Decoding the Context Vector
Source: Dev.to

In the previous article, we stopped at the concept of the context vector.
In this article, we will start by decoding the context vector.
Connecting the Decoder
The first thing we need to do is connect the long‑term and short‑term memories (the cell states and hidden states) that form the context vector to a new set of LSTMs.
- Like the encoder, the decoder also has two layers, each containing two LSTM cells.
- The LSTMs in the decoder have their own separate weights and biases, distinct from those in the encoder.
Using the Context Vector
The context vector is used to initialize the long‑term and short‑term memories (the cell states and hidden states) in the decoder’s LSTMs.
This initialization lets the decoder start with the information learned from the input sentence.
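As a hedged sketch of this hand-off in PyTorch (the layer sizes, seed, and variable names below are illustrative assumptions, not values from the article), the encoder's final hidden and cell states can be passed directly as the decoder's initial states:

```python
import torch
import torch.nn as nn

torch.manual_seed(0)

# Illustrative sizes: 2 layers with 2 units each, echoing the two layers
# of two LSTM cells described for the encoder and decoder.
embed_dim, hidden_dim, num_layers = 4, 2, 2

# Separate modules, so encoder and decoder have their own weights and biases.
encoder = nn.LSTM(embed_dim, hidden_dim, num_layers=num_layers)
decoder = nn.LSTM(embed_dim, hidden_dim, num_layers=num_layers)

# A toy "sentence" of 3 embedded tokens, shaped (seq_len, batch, embed_dim).
src = torch.randn(3, 1, embed_dim)
_, (h, c) = encoder(src)  # the context vector: final hidden and cell states

# Initialize the decoder's short-term (h) and long-term (c) memories
# with the encoder's final states, then feed it a first embedded token.
first_token = torch.randn(1, 1, embed_dim)
out, (h_dec, c_dec) = decoder(first_token, (h, c))

print(h.shape)  # torch.Size([2, 1, 2]) -> (num_layers, batch, hidden_dim)
```

The key line is `decoder(first_token, (h, c))`: instead of starting from zero states, the decoder starts from the memories the encoder built up while reading the input sentence.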
Goal of the Decoder
The ultimate goal of the decoder is to convert the context vector into the output sentence.
- The encoder understands the input.
- The decoder generates the output based on that understanding.
Decoder Inputs
Similar to the encoder, the input to the LSTM cells in the first layer comes from an embedding layer.
The embedding layer creates embedding values for Spanish words, such as:
- ir
- vamos
- y
- &lt;EOS&gt; (End of Sentence symbol)
Each word is treated as a token, and the embedding layer converts them into numbers that the neural network can process.
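A minimal sketch of that token-to-number step, again in PyTorch (the vocabulary indices and embedding size here are made up for illustration):

```python
import torch
import torch.nn as nn

torch.manual_seed(0)

# Hypothetical Spanish-side vocabulary; the indices are arbitrary.
vocab = {"ir": 0, "vamos": 1, "y": 2, "<EOS>": 3}

# The embedding layer maps each token index to a learned vector of numbers.
embedding = nn.Embedding(num_embeddings=len(vocab), embedding_dim=2)

tokens = torch.tensor([vocab["vamos"], vocab["<EOS>"]])
vectors = embedding(tokens)

print(vectors.shape)  # torch.Size([2, 2]): one 2-dim vector per token
```

These vectors are what the first-layer LSTM cells of the decoder actually consume; the embedding weights are trained along with the rest of the network.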
We will explore the details of how the decoder generates the output sentence in the next article.