Understanding LSTMs – Part 5: The Input Gate Explained

Published: (February 26, 2026 at 04:33 PM EST)
2 min read
Source: Dev.to

Input Gate Explained

In the previous article, we went through the second and third components of an LSTM. We will deepen that understanding here.

Starting with the block furthest to the right, we multiply the short‑term memory and the input by their respective weights and sum the results, along with a bias term. This yields a value of 2.03, which becomes the input to the tanh activation function.
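As a rough sketch of that weighted sum: the article reports only the result (2.03) and an LSTM input of 1, so the short‑term memory value, both weights, and the bias below are hypothetical numbers chosen purely so that the arithmetic lands on 2.03.

```python
# Hypothetical values: only lstm_input = 1 and the result 2.03 come from the article.
short_term_memory = 1.0   # assumed
lstm_input = 1.0          # the article's example input
w_short_term = 1.41       # assumed weight on the short-term memory
w_input = 0.94            # assumed weight on the input
bias = -0.32              # assumed bias term

# Weighted sum that feeds the tanh activation
weighted_sum = short_term_memory * w_short_term + lstm_input * w_input + bias
print(round(weighted_sum, 2))  # 2.03
```

Any other combination of weights and bias that sums to 2.03 would serve the walkthrough equally well.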

  • Plugging 2.03 into the tanh function gives approximately 0.97.
  • The tanh activation maps any input to a value between –1 and 1.
    • When the LSTM input is 1, as in this example, the tanh output is close to 1.
    • When the LSTM input is –10, the tanh output is close to –1.

Thus, based on the short‑term memory and the input, we now have a potential memory of 0.97.
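The potential memory can be checked with a few lines of Python. This is a minimal sketch that starts from the article's weighted sum of 2.03 and uses only the standard library; the variable names are illustrative.

```python
import math

# Weighted sum reported in the article (input to tanh)
weighted_sum = 2.03

# tanh squashes the sum into the range (-1, 1)
potential_memory = math.tanh(weighted_sum)
print(round(potential_memory, 2))  # 0.97

# A strongly negative value fed to tanh saturates near -1
print(round(math.tanh(-10), 2))  # -1.0
```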

Next, the LSTM decides how much of this potential memory to retain, using the same method as before.

  • The value 4.27 serves as the x‑axis input to the sigmoid activation function.
  • Applying the sigmoid function yields a y‑axis value of approximately 0.99, which rounds to 1.0.

This means essentially the entire potential long‑term memory is retained, because multiplying it by a value this close to 1 barely changes it. If the input were –10, the percentage of potential memory to retain would be close to 0, so almost nothing would be added to the long‑term memory.
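The retention step can be sketched the same way. The `sigmoid` helper below is written out by hand for illustration, and 4.27 is the value quoted in the article.

```python
import math

def sigmoid(x: float) -> float:
    """Logistic function: maps any real number into the range (0, 1)."""
    return 1.0 / (1.0 + math.exp(-x))

# Fraction of the potential memory to retain, from the article's value
retain_fraction = sigmoid(4.27)
print(round(retain_fraction, 2))  # 0.99

# A strongly negative value drives the fraction toward 0
print(round(sigmoid(-10), 4))  # 0.0
```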

Finally, we add the retained potential memory (0.97) to the existing long‑term memory. Together, the tanh block, the sigmoid block, and this addition make up the input gate.
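Putting the steps together, the whole update might look like the sketch below. The function name `input_gate_update` and the prior long‑term memory of 2.0 are assumptions made for illustration; the article does not state the existing long‑term memory value.

```python
import math

def sigmoid(x: float) -> float:
    """Logistic function: maps any real number into the range (0, 1)."""
    return 1.0 / (1.0 + math.exp(-x))

def input_gate_update(long_term_memory: float,
                      candidate_sum: float,
                      gate_sum: float) -> float:
    """Add the retained share of the potential memory to the long-term memory."""
    potential_memory = math.tanh(candidate_sum)   # value in (-1, 1)
    retain_fraction = sigmoid(gate_sum)           # value in (0, 1)
    return long_term_memory + retain_fraction * potential_memory

# 2.0 is a hypothetical prior long-term memory; 2.03 and 4.27 come from the article
updated = input_gate_update(2.0, 2.03, 4.27)
print(round(updated, 2))  # 2.95
```

Without rounding the intermediate values, about 0.95 is added rather than the 0.97 obtained when the sigmoid output is rounded to 1.0.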

The next article will discuss the final stage of the LSTM.
