Understanding LSTMs – Part 6: How LSTM Produces Its Final Output

Published: March 1, 2026 at 03:50 PM EST

Source: Dev.to

Final Stage: Updating the Short‑Term Memory

In the previous article we went through the input gate. In this article we explore the final component, the output gate.

This final stage updates the short‑term memory.
We begin with the new long‑term memory and use it as the input to the tanh activation function. Plugging 2.96 into tanh gives ≈ 0.99. This value, 0.99, is the potential short‑term memory.
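The tanh step above can be checked with a few lines of Python. This is only a sketch: the variable names are illustrative and not from the article, and the value 2.96 is the new long‑term memory quoted in the text.

```python
import math

# New long-term memory, taken from the article's example.
long_term_memory = 2.96

# tanh squashes any real number into the range (-1, 1).
potential_short_term_memory = math.tanh(long_term_memory)

print(round(potential_short_term_memory, 2))  # 0.99
```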

Deciding How Much to Output

Now the LSTM must determine how much of this potential short‑term memory to pass forward. As in the previous stages, a sigmoid activation function, applied to the current input and the previous short‑term memory, decides what percentage the LSTM keeps. In this example, the calculation gives 0.99.
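The article does not list the numbers that go into this sigmoid, so the sketch below uses made-up inputs, weights, and bias purely for illustration. They are chosen so the result lands near 0.99 and are not the values behind the article's figure. The function name `output_gate_fraction` is also hypothetical.

```python
import math

def sigmoid(z: float) -> float:
    """Squash any real number into the range (0, 1)."""
    return 1.0 / (1.0 + math.exp(-z))

def output_gate_fraction(x, h_prev, w_x, w_h, bias):
    """Fraction of the potential short-term memory to keep.

    Weighted sum of the current input and the previous short-term
    memory, plus a bias, passed through a sigmoid.
    """
    return sigmoid(w_x * x + w_h * h_prev + bias)

# Hypothetical values, NOT taken from the article.
fraction = output_gate_fraction(x=1.0, h_prev=1.0, w_x=2.0, w_h=2.0, bias=0.6)

print(round(fraction, 2))  # 0.99
```

Because the sigmoid output always lies between 0 and 1, it can be read as a percentage of the potential memory to let through.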

Creating the New Short‑Term Memory

We now multiply the two values:

\[ \text{new short-term memory} = 0.99 \times 0.99 \approx 0.98 \]

This produces the new short‑term memory, 0.98, which is also the final output of the entire LSTM unit. For that reason, this stage is called the output gate.
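Putting the stage together, a minimal sketch might look like the following. The function name `output_stage` is an assumption for illustration, and the gate fraction 0.99 is passed in directly because the article reports only the result of the sigmoid, not its inputs.

```python
import math

def output_stage(long_term_memory: float, gate_fraction: float) -> float:
    """Return the new short-term memory (the LSTM unit's output).

    gate_fraction is the sigmoid output of the output gate (0 to 1).
    """
    potential = math.tanh(long_term_memory)  # potential short-term memory
    return gate_fraction * potential         # keep only the gated share

# Values from the article's example.
new_short_term_memory = output_stage(long_term_memory=2.96, gate_fraction=0.99)

print(round(new_short_term_memory, 2))  # 0.98
```

The code multiplies by the unrounded tanh value, whereas the article rounds to 0.99 first. Both routes give 0.98 at two decimal places.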

Now that we understand how the three stages of an LSTM work, we will see them in action with real data in the next article.
