Step 2: Activation Functions

ReLU, sigmoid, and softmax


Activation functions

The activation function decides whether a neuron "fires." Without it, stacking layers would just be one big linear equation — no better than logistic regression.
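This collapse is easy to demonstrate: a minimal NumPy sketch (the shapes and layer sizes are illustrative, not from the lesson) showing that two stacked linear layers with no activation are equivalent to a single linear layer.

```python
import numpy as np

rng = np.random.default_rng(0)
x = rng.normal(size=(4, 3))                       # a batch of 4 inputs
W1, b1 = rng.normal(size=(3, 5)), rng.normal(size=5)
W2, b2 = rng.normal(size=(5, 2)), rng.normal(size=2)

# Two "layers" with no activation in between...
h = x @ W1 + b1
out = h @ W2 + b2

# ...are exactly one linear map: W = W1 @ W2, b = b1 @ W2 + b2.
W = W1 @ W2
b = b1 @ W2 + b2
assert np.allclose(out, x @ W + b)
```

However many such layers you stack, the result stays linear; the nonlinearity of the activation function is what gives depth its power.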

When to use which?

| Function | Output range | Use for | Layer |
|----------|--------------|---------|-------|
| ReLU | [0, +inf) | General purpose | Hidden layers |
| Sigmoid | [0, 1] | Binary classification | Output (1 neuron) |
| Softmax | [0, 1], sum = 1 | Multi-class | Output (N neurons) |
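The three functions in the table can each be written in a line or two of NumPy. This is an illustrative sketch, not the lesson's own code; the max-subtraction in softmax is a standard numerical-stability trick.

```python
import numpy as np

def relu(z):
    return np.maximum(0, z)           # range [0, +inf)

def sigmoid(z):
    return 1 / (1 + np.exp(-z))       # range (0, 1); sigmoid(0) = 0.5

def softmax(z):
    # Subtracting the row max avoids overflow in exp without changing the result.
    e = np.exp(z - z.max(axis=-1, keepdims=True))
    return e / e.sum(axis=-1, keepdims=True)   # each row sums to 1

z = np.array([-2.0, 0.0, 3.0])
print(relu(z))      # negatives clipped to 0
print(sigmoid(z))   # each value squashed into (0, 1)
print(softmax(z))   # a probability distribution over the 3 entries
```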

Think Deeper

ReLU outputs 0 for all negative inputs. What happens if a neuron's weights cause it to always receive negative values?

It's permanently "dead": it always outputs 0, its gradient is 0, so its weights never update. This is the dying ReLU problem. Common fixes: LeakyReLU (a small slope for negative inputs) or careful weight initialisation.
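A minimal sketch of the LeakyReLU fix (the slope value 0.01 is a common default, not from the lesson): unlike ReLU, it keeps a small nonzero gradient for negative inputs, so a "dead" neuron can still recover.

```python
import numpy as np

def leaky_relu(z, alpha=0.01):
    # For z > 0 behave like ReLU; for z <= 0, pass through a small slope
    # alpha so the gradient is alpha instead of 0.
    return np.where(z > 0, z, alpha * z)

z = np.array([-100.0, -1.0, 2.0])
print(leaky_relu(z))   # negative inputs are scaled by 0.01, not zeroed
```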
Cybersecurity tie-in: For a binary intrusion detector (attack/benign), use sigmoid output — it gives you P(attack). For classifying traffic into categories (malware, C2, scan, benign), use softmax with one output per class.
