Parse the hidden numbers
The duration_str column looks like "2.34s" — a string with a unit suffix.
Strip the "s", convert to float, then derive bytes_per_second as bytes sent divided by the duration in seconds.
Raw duration_str → strip "s" → Duration (seconds)
Bytes sent ÷ Duration (seconds) → Bytes/second
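The parsing step can be sketched in pandas as follows. This is a minimal illustration, not the lesson's own code: it assumes the data lives in a DataFrame with columns named duration_str and bytes_sent, and the new column name duration is an assumption. The sample rows reuse values from the top-5 table in this lesson.

```python
import pandas as pd

# Sample rows reusing values from the lesson's top-5 table.
df = pd.DataFrame({
    "duration_str": ["0.05s", "0.45s", "2.51s"],
    "bytes_sent": [6940, 8038, 17489],
})

# Strip the trailing "s" unit, then convert the remaining text to float seconds.
# "duration" is an assumed column name for the parsed value.
df["duration"] = df["duration_str"].str.rstrip("s").astype(float)

# Derive throughput: bytes sent divided by duration in seconds.
df["bytes_per_second"] = df["bytes_sent"] / df["duration"]

print(df)
```

Note that a duration of 0 would produce an infinite rate here, so real data may need a guard before dividing.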
More derived features
| Feature | Formula | Value |
|---|---|---|
| packet_rate | packets / duration | — |
| bytes_ratio | sent / (recv + 1) | — |
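The two formulas in the table could be computed like this. It is a sketch under assumptions: the column names packets, duration, and bytes_recv are guesses at how the dataset is laid out, and the sample values are made up purely to show the arithmetic.

```python
import pandas as pd

# Made-up sample rows; the column names are assumptions, not the lesson's schema.
df = pd.DataFrame({
    "packets": [10, 40],
    "duration": [0.5, 2.0],      # seconds, already parsed to float
    "bytes_sent": [5000, 1200],
    "bytes_recv": [0, 399],
})

# packet_rate = packets / duration
df["packet_rate"] = df["packets"] / df["duration"]

# bytes_ratio = sent / (recv + 1); the +1 keeps the denominator non-zero
# when a connection received nothing.
df["bytes_ratio"] = df["bytes_sent"] / (df["bytes_recv"] + 1)

print(df[["packet_rate", "bytes_ratio"]])
```

The first row shows why the +1 matters: with 0 bytes received, the ratio is still a finite 5000.0 instead of a division error.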
Top 5 suspicious (by bytes/sec)
| # | Bytes/sec | Bytes sent | Duration |
|---|---|---|---|
| 1 | 138800.0 | 6940 | 0.05s |
| 2 | 92380.0 | 4619 | 0.05s |
| 3 | 88860.0 | 4443 | 0.05s |
| 4 | 17862.2 | 8038 | 0.45s |
| 5 | 6967.7 | 17489 | 2.51s |
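A ranking like the one above can be produced with pandas' nlargest. This sketch assumes the same hypothetical column names as before (bytes_sent, duration, bytes_per_second) and uses the five rows from the table, deliberately shuffled, to show that the ordering is recovered.

```python
import pandas as pd

# The five rows from the table above, in shuffled order.
df = pd.DataFrame({
    "bytes_sent": [17489, 4443, 6940, 8038, 4619],
    "duration": [2.51, 0.05, 0.05, 0.45, 0.05],  # seconds
})
df["bytes_per_second"] = df["bytes_sent"] / df["duration"]

# Keep the 5 rows with the highest throughput, largest first.
top = df.nlargest(5, "bytes_per_second")

print(top.round(1))
```

Sorting by bytes_sent instead would put the 17489-byte connection first, even though it has the lowest rate of the five.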
Think Deeper
Try this:
A connection sends 50,000 bytes in 0.5 seconds. Another sends 50,000 bytes in 50 seconds. Same bytes_sent — but which is suspicious?
The first one: 100,000 bytes/sec is a rapid data transfer, possibly exfiltration. The second is 1,000 bytes/sec — normal browsing. Raw bytes_sent can't distinguish them, but bytes_per_second immediately flags the anomaly.
Cybersecurity tie-in: A 50,000-byte transfer in 0.5 seconds is a red flag (possible data exfiltration).
The same 50,000 bytes over 50 seconds is normal browsing. bytes_per_second captures the signal that
raw bytes_sent alone misses.
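The arithmetic behind this comparison can be checked in a few lines. The helper name bytes_per_second is an illustrative assumption, and it expects a duration greater than zero.

```python
def bytes_per_second(bytes_sent: float, duration_s: float) -> float:
    """Return throughput in bytes per second; assumes duration_s > 0."""
    return bytes_sent / duration_s

fast = bytes_per_second(50_000, 0.5)   # the 0.5-second transfer
slow = bytes_per_second(50_000, 50)    # the 50-second transfer

print(fast, slow, fast / slow)  # 100000.0 1000.0 100.0
```

Both connections have identical bytes_sent, yet their rates differ by a factor of 100, which is the signal the derived feature exposes.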