Feature Importance
After training, model.feature_importances_ tells you how much each feature contributed to the tree's decisions.
Each value is the total impurity decrease (mean decrease in impurity, MDI) that the feature produced across all of its splits, weighted by the number of samples each split affects and normalised so the values sum to 1.0:
```python
importances = model.feature_importances_
for name, imp in sorted(zip(FEATURES, importances),
                        key=lambda x: x[1], reverse=True):
    bar = '█' * int(imp * 40)
    print(f"  {name:25s} {bar} {imp:.3f}")
```
A feature used near the root, where splits affect the most samples, accumulates more importance.
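As a minimal end-to-end sketch of where `feature_importances_` comes from (the synthetic data and labels below are illustrative stand-ins, not the tutorial's dataset; only `FEATURES` and a scikit-learn `DecisionTreeClassifier` are assumed):

```python
import numpy as np
from sklearn.tree import DecisionTreeClassifier

FEATURES = ['connection_rate', 'bytes_sent', 'unique_dest_ports',
            'duration_seconds', 'failed_conns']

# Illustrative stand-in data: the label is driven entirely by connection_rate
rng = np.random.default_rng(0)
X = rng.normal(size=(500, len(FEATURES)))
y = (X[:, 0] > 0.5).astype(int)

model = DecisionTreeClassifier(max_depth=4, random_state=0).fit(X, y)

importances = model.feature_importances_
print(round(importances.sum(), 6))     # normalised: the values sum to 1.0
print(FEATURES[importances.argmax()])  # the feature that drives the labels dominates
```

Because the label here depends only on the first column, nearly all of the importance mass lands on `connection_rate`; on real traffic data the distribution is flatter.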
Why These Features Matter for Network Security
| Feature | High value suggests |
|---|---|
| connection_rate | Port scanning or brute-force attack |
| bytes_sent | Data exfiltration (large outbound transfer) |
| unique_dest_ports | Port scanning (probing many services) |
| duration_seconds | Low-and-slow attacks vs quick scans |
| failed_conns | Brute-force or malformed exploit attempts |
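For concreteness, here is one way such features might be aggregated from raw connection records; the `Conn` record type, its fields, and the fixed 60-second window are hypothetical illustrations, not the tutorial's actual pipeline:

```python
from dataclasses import dataclass

@dataclass
class Conn:
    dest_port: int
    bytes_sent: int
    duration: float
    failed: bool

def extract_features(window: list[Conn], window_seconds: float = 60.0) -> dict:
    """Aggregate one host's connections over a time window into the feature vector."""
    return {
        'connection_rate':   len(window) / window_seconds,
        'bytes_sent':        sum(c.bytes_sent for c in window),
        'unique_dest_ports': len({c.dest_port for c in window}),
        'duration_seconds':  sum(c.duration for c in window),
        'failed_conns':      sum(c.failed for c in window),
    }

# A port-scan-like window: many short, failed connections to distinct ports
scan = [Conn(dest_port=p, bytes_sent=40, duration=0.01, failed=True)
        for p in range(1, 31)]
print(extract_features(scan))
```

The scan-like window produces exactly the profile the table describes: high connection rate, many unique destination ports, and many failed connections.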
Visualising Feature Importance
```python
import matplotlib.pyplot as plt

# Sort features by importance (descending)
sorted_idx = importances.argsort()[::-1]

plt.figure(figsize=(10, 5))
plt.barh(range(len(FEATURES)),
         importances[sorted_idx],
         align='center')
plt.yticks(range(len(FEATURES)),
           [FEATURES[i] for i in sorted_idx])
plt.xlabel('Feature Importance (MDI)')
plt.title('Which features drive predictions?')
plt.gca().invert_yaxis()
plt.tight_layout()
plt.show()
```
What Importance Does NOT Tell You
- Whether the relationship is positive or negative
- Whether the effect is linear or non-linear
- Whether the feature would be important in a different model type
- Correlated features split importance between them — each looks less important individually
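A common cross-check for MDI, which is computed on training data and can flatter features the tree overfits to, is permutation importance on a held-out split: shuffle one feature at a time and measure how much test accuracy drops. A sketch using scikit-learn's `permutation_importance` (the synthetic data is a stand-in; here only feature 0 and, more weakly, feature 1 carry signal):

```python
import numpy as np
from sklearn.inspection import permutation_importance
from sklearn.model_selection import train_test_split
from sklearn.tree import DecisionTreeClassifier

rng = np.random.default_rng(1)
X = rng.normal(size=(600, 5))
y = (X[:, 0] + 0.5 * X[:, 1] > 0).astype(int)  # features 2-4 are pure noise

X_train, X_test, y_train, y_test = train_test_split(X, y, random_state=0)
model = DecisionTreeClassifier(max_depth=5, random_state=0).fit(X_train, y_train)

# Shuffle each feature on held-out data and record the mean accuracy drop
result = permutation_importance(model, X_test, y_test,
                                n_repeats=20, random_state=0)
for i in result.importances_mean.argsort()[::-1]:
    print(f"feature {i}: {result.importances_mean[i]:.3f}")
```

Informative features show a clear accuracy drop when shuffled, while the noise features hover near zero; unlike MDI, this measures importance on data the tree never saw.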
Think Deeper
Try this:
If you remove the top feature (connection_rate) and retrain, what happens to accuracy? Does the second feature become more important?
Accuracy drops because you removed the most discriminative signal. The second feature (bytes_sent) absorbs some of the lost signal and its importance score increases. This reveals feature redundancy: correlated features can partially substitute for each other.
Cybersecurity tie-in: Feature importance guides feature selection. If bytes_received has near-zero importance, you might drop it to simplify the model and reduce the data you need to collect. In production, less data collection means lower storage and processing cost.