LSTM

Jul 3, 2026 · 9 min read

Why LSTM RNN?

A plain recurrent neural network carries information forward one hidden state at a time — each step blends the current input with whatever the previous step rem…

Tutorial

Jul 3, 2026 · 8 min read

LSTM Architecture

The previous post established why a vanilla RNN's single hidden state breaks down over long sequences — gradients shrink multiplicatively at every timestep. LST…

Tutorial

Jul 3, 2026 · 8 min read

Forget Gate

The cell state carries information forward across timesteps, but not everything that was relevant a moment ago stays relevant. A language model tracking "she ha…

Tutorial

Jul 3, 2026 · 7 min read

Input Gate & Candidate Memory

The forget gate decides what survives from the cell state's past. It never adds anything new. Once "He" has erased the gender dimension, the cell state needs fr…

Tutorial

Jul 3, 2026 · 6 min read

Output Gate

The cell state now holds everything the LSTM has decided is worth remembering — a mix of long-term signal built up across forget and input gates. But not all of…