Activation functions
The activation function decides whether and how strongly a neuron "fires." Without one, stacking layers would collapse into a single linear function, no better than logistic regression.
When to use which?
| Function | Output range | Use for | Layer |
|---|---|---|---|
| ReLU | [0, +inf) | General purpose | Hidden layers |
| Sigmoid | [0, 1] | Binary classification | Output (1 neuron) |
| Softmax | [0, 1], sum=1 | Multi-class | Output (N neurons) |
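As a minimal sketch, the three functions in the table can be written in a few lines of NumPy (function names here are our own, not from any particular library):

```python
import numpy as np

def relu(x):
    # ReLU: max(0, x) — negative inputs are clipped to 0, so outputs lie in [0, +inf)
    return np.maximum(0.0, x)

def sigmoid(x):
    # Sigmoid: squashes any real input into (0, 1), interpretable as a probability
    return 1.0 / (1.0 + np.exp(-x))

def softmax(x):
    # Softmax: exponentiate, then normalise so the outputs sum to 1
    e = np.exp(x - np.max(x))  # subtract the max for numerical stability
    return e / e.sum()

z = np.array([-2.0, 0.5, 3.0])
print(relu(z))     # negatives become 0, positives pass through
print(sigmoid(z))  # each value lies strictly between 0 and 1
print(softmax(z))  # a probability distribution: values sum to 1
```

Note the `x - np.max(x)` trick in softmax: it changes nothing mathematically but prevents `np.exp` from overflowing on large logits.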
Think Deeper
Try this:
ReLU outputs 0 for all negative inputs. What happens if a neuron's weights cause it to always receive negative values?
It's permanently "dead": it always outputs 0, its gradient is 0, so its weights never update. This is the dying ReLU problem. Common fixes: LeakyReLU (a small slope for negative inputs) or careful weight initialisation.
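A quick sketch of the LeakyReLU fix mentioned above (the slope `alpha=0.01` is a common default, not a required value):

```python
import numpy as np

def leaky_relu(x, alpha=0.01):
    # Like ReLU, but negative inputs keep a small slope alpha
    # instead of a hard 0, so the gradient is never exactly 0
    # and the neuron can recover rather than staying "dead".
    return np.where(x > 0, x, alpha * x)

x = np.array([-10.0, -1.0, 2.0])
print(leaky_relu(x))  # negatives scaled by 0.01; positives pass through unchanged
```

Because the negative branch has slope `alpha` rather than 0, backpropagation still nudges the weights of a neuron that currently receives only negative inputs.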
Cybersecurity tie-in: for a binary intrusion detector (attack/benign), use a sigmoid output, which gives you P(attack). For classifying traffic into categories (malware, C2, scan, benign), use a softmax output with one neuron per class.
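A sketch of the two output-layer choices side by side. The logit values and class names are illustrative only, standing in for what a trained network's final layer might produce:

```python
import numpy as np

# Binary detector: ONE output neuron + sigmoid -> a single P(attack)
attack_logit = 1.2  # hypothetical raw output of the last layer
p_attack = 1.0 / (1.0 + np.exp(-attack_logit))
print(f"P(attack) = {p_attack:.2f}")

# Multi-class classifier: one neuron PER class + softmax
# -> a probability distribution over all traffic categories
classes = ["malware", "c2", "scan", "benign"]  # illustrative labels
logits = np.array([2.1, 0.3, -0.5, 1.0])       # hypothetical raw outputs
e = np.exp(logits - logits.max())              # stabilised softmax
probs = e / e.sum()
print(dict(zip(classes, np.round(probs, 2))))
```

The design difference: sigmoid treats detection as one yes/no question, while softmax forces the class probabilities to compete, so raising confidence in "malware" necessarily lowers it elsewhere.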