The Alchemist.


Activations

Tutorial
Jul 1, 2026 · 10 min read

Vanishing Gradient Problem

You train a 5-layer sigmoid network on a classification task and watch the loss barely move for the first hundred epochs. The model isn't broken — gradient desc…
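A rough way to see why the loss can stall: the sigmoid derivative never exceeds 0.25, so the chain of activation derivatives through five layers is at most 0.25^5. The sketch below computes only that bound. It ignores the weight matrices entirely, and the function names are illustrative assumptions, not code from the tutorial.

```python
import math

def sigmoid(z):
    return 1.0 / (1.0 + math.exp(-z))

def sigmoid_grad(z):
    # d/dz sigmoid(z) = s * (1 - s); it peaks at z = 0 with value 0.25
    s = sigmoid(z)
    return s * (1.0 - s)

def activation_gradient_bound(num_layers):
    # Best case: every layer sits exactly at the derivative's peak.
    # Weight terms are deliberately left out of this simplified bound.
    return sigmoid_grad(0.0) ** num_layers

bound_5 = activation_gradient_bound(5)
print(f"max sigmoid derivative: {sigmoid_grad(0.0)}")
print(f"upper bound across 5 layers: {bound_5:.6f}")
```

Even in this best case, the signal reaching the first layer is under a thousandth of what the last layer sees.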

Tutorial
Jul 1, 2026 · 10 min read

Sigmoid Activation Function

A neuron computes a weighted sum z = w·x + b. That number can be anything: −1000, 0, 47.3. But for a binary classification output — "will this loan default?" —…
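As a minimal sketch of that squashing step, here is a sigmoid that stays finite for extreme inputs such as −1000. The branch on the sign of z is a common numerical-stability trick and an assumption on my part, not necessarily how the tutorial implements it; the weights, inputs, and bias are made up for illustration.

```python
import math

def sigmoid(z):
    # Branch on the sign so math.exp never receives a large positive argument.
    if z >= 0:
        return 1.0 / (1.0 + math.exp(-z))
    ez = math.exp(z)
    return ez / (1.0 + ez)

def neuron(w, x, b):
    # Weighted sum z = w·x + b, then squash it into the unit interval.
    z = sum(wi * xi for wi, xi in zip(w, x)) + b
    return sigmoid(z)

# Hypothetical weights, features, and bias, chosen only for the demo.
p = neuron([0.5, -1.0], [2.0, 1.0], 0.0)  # z = 0, so p = 0.5

for z in (-1000.0, 0.0, 47.3):
    print(z, sigmoid(z))
```

A naive `1 / (1 + math.exp(-z))` would raise `OverflowError` at z = −1000, which is why the negative branch is rewritten in terms of `exp(z)`.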

Tutorial
Jul 1, 2026 · 8 min read

Tanh Activation Function

Sigmoid solves one problem — mapping z to a probability — but introduces another: every output is positive, which forces all upstream weight gradients to update…
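To make the "every output is positive" point concrete, the sketch below feeds the same symmetric inputs through both functions. The input values are arbitrary choices for illustration; the identity tanh(z) = 2·sigmoid(2z) − 1 is standard.

```python
import math

def sigmoid(z):
    return 1.0 / (1.0 + math.exp(-z))

# Symmetric sample of pre-activations (arbitrary demo values).
zs = [-2.0, -1.0, 0.0, 1.0, 2.0]

sigmoid_outputs = [sigmoid(z) for z in zs]
tanh_outputs = [math.tanh(z) for z in zs]

sigmoid_mean = sum(sigmoid_outputs) / len(zs)  # centered near 0.5
tanh_mean = sum(tanh_outputs) / len(zs)        # centered near 0

print("sigmoid:", [round(v, 3) for v in sigmoid_outputs])
print("tanh:   ", [round(v, 3) for v in tanh_outputs])
```

Sigmoid never leaves the positive side, while tanh spreads the same inputs evenly around zero.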

Tutorial
Jul 1, 2026 · 10 min read

ReLU Activation Function

Sigmoid and tanh both saturate — for large |z|, their derivatives collapse toward zero and gradients die. ReLU sidesteps this entirely for positive values: the…
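A quick comparison of the three derivatives at a large positive input shows the saturation gap. This is a sketch with illustrative function names; the ReLU derivative at exactly z = 0 is set to 0 here, which is a common convention rather than something the tutorial specifies.

```python
import math

def sigmoid_grad(z):
    s = 1.0 / (1.0 + math.exp(-z))
    return s * (1.0 - s)

def tanh_grad(z):
    return 1.0 - math.tanh(z) ** 2

def relu_grad(z):
    # Convention assumed here: the derivative at z == 0 is taken as 0.
    return 1.0 if z > 0 else 0.0

z = 10.0
print(f"sigmoid'({z}) = {sigmoid_grad(z):.2e}")
print(f"tanh'({z})    = {tanh_grad(z):.2e}")
print(f"relu'({z})    = {relu_grad(z)}")
```

For any positive z the ReLU derivative stays at exactly 1, so it does not shrink the gradient passing through it.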

Tutorial
Jul 1, 2026 · 9 min read

Leaky ReLU and Parametric ReLU

ReLU kills neurons. When z ≤ 0 for every input a neuron encounters, its output is zero and its gradient is zero — the weight never moves again. The fix is simpl…
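The difference shows up on the negative side, as in this minimal sketch. The slope `alpha = 0.01` is a commonly used default and an assumption here. In Parametric ReLU the same slope would be a learned parameter, which this example does not model.

```python
def relu_grad(z):
    return 1.0 if z > 0 else 0.0

def leaky_relu(z, alpha=0.01):
    # Pass positive values through; scale negative ones by a small slope.
    return z if z > 0 else alpha * z

def leaky_relu_grad(z, alpha=0.01):
    # The negative side keeps a small nonzero gradient instead of 0.
    return 1.0 if z > 0 else alpha

z = -2.0
print("relu grad:      ", relu_grad(z))
print("leaky relu out: ", leaky_relu(z))
print("leaky relu grad:", leaky_relu_grad(z))
```

Because the gradient for z ≤ 0 is `alpha` rather than zero, a neuron stuck on the negative side still receives weight updates.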

© 2026 Mohammed Vasim. Built with curiosity.