ML Project Cybersecurity Deep Learning IIoT

IoT NETWORK
INTRUSION
DETECTION

A neural network trained on RT-IoT2022 — 123,117 real-world network flow records from a smart home testbed — to classify traffic into 12 attack and benign categories. Built with TensorFlow / Keras, class-weight balancing, and a rigorous preprocessing pipeline.

123K Flow Records
85 Features
12 Attack Classes
98.4% Test Accuracy
Live network diagram: Input (92) → Hidden (16) → Output (12)

01 — Dataset

RT-IoT2022
CLASS DISTRIBUTION

Captured from a real smart-home testbed. Severely imbalanced — DOS_SYN_Hping alone accounts for ~77% of records, making naive accuracy a misleading metric.

Sample Count by Attack Type (log scale)
DOS_SYN_Hping: 94,659 · Attack
Thing_Speak: 8,108 · Benign
ARP_poisioning: 7,750 · Attack
MQTT_Publish: 4,146 · Benign
NMAP_UDP_SCAN: 2,590 · Attack
NMAP_XMAS_TREE_SCAN: 2,010 · Attack
NMAP_OS_DETECTION: 2,000 · Attack
NMAP_TCP_scan: 1,002 · Attack
DDOS_Slowloris: 534 · Attack
Wipro_bulb: 253 · Benign
Metasploit_Brute_Force_SSH: 37 · Attack
NMAP_FIN_SCAN: 28 · Attack

02 — Pipeline

PREPROCESSING
PIPELINE

Every step runs in order to produce clean, leak-free training data for the classifier.

Load Data

UCI ML Repository via ucimlrepo. 123,117 rows × 85 cols. No missing values.
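A minimal loader sketch for this step. The dataset id (942) is what ucimlrepo lists for RT-IoT2022 at the time of writing; verify it against the UCI repository page before relying on it, and note the function needs network access when called.

```python
def load_rt_iot2022():
    """Fetch RT-IoT2022 from the UCI ML Repository (requires network)."""
    from ucimlrepo import fetch_ucirepo  # pip install ucimlrepo

    rt_iot = fetch_ucirepo(id=942)   # assumed id for RT-IoT2022; verify on uci.edu
    X = rt_iot.data.features         # 123,117 rows of network-flow features
    y = rt_iot.data.targets          # Attack_type labels (12 classes)
    return X, y
```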

Drop Index

Remove the unnamed row-number column saved into the CSV — carries no signal.
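A toy illustration of this step, using the "Unnamed: 0" column name that pandas typically produces when a CSV is saved with its index:

```python
import pandas as pd

# A CSV saved with its index grows an "Unnamed: 0" column on re-load.
df = pd.DataFrame({"Unnamed: 0": [0, 1, 2], "flow_duration": [0.1, 2.3, 0.7]})

# Pure row numbers carry no signal, so drop them before training.
df = df.drop(columns=["Unnamed: 0"])
print(list(df.columns))  # ['flow_duration']
```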

One-Hot Encode

pd.get_dummies on proto & service (2 categorical cols → 10 binary columns).

Fixed: called twice
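A sketch of the encoding step on toy flows (the column values below are illustrative, not real RT-IoT2022 records). Calling `pd.get_dummies` a second time on the already-encoded frame would fail, since the original categorical columns no longer exist, hence the single-call fix noted above.

```python
import pandas as pd

# Toy flows with the two categorical columns ("proto", "service").
flows = pd.DataFrame({
    "proto":        ["tcp", "udp", "tcp", "icmp"],
    "service":      ["mqtt", "dns", "http", "-"],
    "fwd_pkts_tot": [10, 2, 7, 1],
})

# One call only: replaces each categorical column with binary indicator columns.
encoded = pd.get_dummies(flows, columns=["proto", "service"])
print(sorted(encoded.columns))
```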

Stratified Split

80/20 train-test, stratified on Attack_type to preserve rare class proportions.

Fixed: duplicate call
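The split can be sketched on a miniature 90/10 class skew, showing that stratification preserves the ratio exactly in both halves:

```python
import numpy as np
from sklearn.model_selection import train_test_split

# Synthetic stand-in: 100 flows, 90 of class 0 and 10 of class 1.
X = np.arange(100).reshape(-1, 1)
y = np.array([0] * 90 + [1] * 10)

X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.2, stratify=y, random_state=42
)

# Stratification keeps the 9:1 ratio: 8 minority samples in train, 2 in test.
print((y_train == 1).sum(), (y_test == 1).sum())
```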

Drop Constant Cols

Remove columns with std = 0 before scaling to avoid division-by-zero NaN values.

Fixed: order
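This step reduces to a one-liner; a toy frame with a zero-variance column (hypothetical column names) shows the idea:

```python
import pandas as pd

df = pd.DataFrame({
    "flow_duration": [0.1, 2.3, 0.7],
    "always_zero":   [0.0, 0.0, 0.0],  # std = 0 -> division by zero when scaled
    "pkt_count":     [3, 50, 12],
})

# Keep only columns whose standard deviation is non-zero.
nonconstant = df.loc[:, df.std() > 0]
print(list(nonconstant.columns))
```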

Standardise

Fit StandardScaler on train only. Apply to both train and test to prevent leakage.
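A leak-free scaling sketch on toy values: the scaler learns its mean and standard deviation from the training rows only, then reuses them on test rows.

```python
import numpy as np
from sklearn.preprocessing import StandardScaler

X_train = np.array([[1.0], [2.0], [3.0], [4.0]])
X_test = np.array([[2.5], [10.0]])   # never shown to fit()

scaler = StandardScaler()
X_train_s = scaler.fit_transform(X_train)  # statistics from train only
X_test_s = scaler.transform(X_test)        # reuse train mean/std: no leakage

print(scaler.mean_, X_test_s.ravel())
```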

NaN / Inf Fix

np.nan_to_num replaces any remaining NaN / ±Inf with 0.0 after scaling.

Fixed: before fit
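The cleanup step itself is one call; a sketch with deliberately bad values:

```python
import numpy as np

X = np.array([[0.5, np.nan], [np.inf, -np.inf]])

# Replace NaN and +/-Inf with 0.0 so no bad values reach the network.
X_clean = np.nan_to_num(X, nan=0.0, posinf=0.0, neginf=0.0)
print(X_clean)
```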

Label Encode

LabelEncoder maps 12 string class names → integers 0–11 for sparse CE loss.

Fixed: dead code
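A sketch of the label-encoding step on a few of the actual class names. `LabelEncoder` assigns integers in sorted (alphabetical) order of the class strings:

```python
from sklearn.preprocessing import LabelEncoder

labels = ["DOS_SYN_Hping", "MQTT_Publish", "NMAP_FIN_SCAN", "DOS_SYN_Hping"]

le = LabelEncoder()
y = le.fit_transform(labels)  # sorted class names -> integers 0..k-1

print(list(le.classes_), list(y))
```

The integer targets feed directly into `sparse_categorical_crossentropy`, avoiding a one-hot target matrix.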

03 — Model Architecture

NEURAL NETWORK
DESIGN

Intentionally lean: one hidden layer with 16 neurons. Simplicity is a feature for a network-flow classifier where the signal is strong.

Input Layer
92 features (after encoding & constant-col removal)
INPUT
Dense — ReLU
16 neurons · (92+1)×16 = 1,488 params
HIDDEN
Dense — Softmax
12 neurons · (16+1)×12 = 204 params
OUTPUT
Total trainable parameters: 1,692
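A sketch of how this 92 → 16 → 12 model might be defined in Keras. The lazy import keeps TensorFlow optional here; the parameter arithmetic below mirrors the counts listed above.

```python
def build_model(n_features: int = 92, n_classes: int = 12):
    """Sketch of the 92 -> 16 -> 12 classifier described above."""
    from tensorflow import keras  # imported lazily; TF only needed to train

    model = keras.Sequential([
        keras.Input(shape=(n_features,)),
        keras.layers.Dense(16, activation="relu"),       # hidden layer
        keras.layers.Dense(n_classes, activation="softmax"),
    ])
    model.compile(
        optimizer=keras.optimizers.Adam(learning_rate=0.001),
        loss="sparse_categorical_crossentropy",
        metrics=["accuracy"],
    )
    return model

# Parameter count is (inputs + 1 bias) x units per Dense layer:
hidden_params = (92 + 1) * 16                  # 1,488
output_params = (16 + 1) * 12                  # 204
total_params = hidden_params + output_params   # 1,692
```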
Training Config
Optimizer: Adam (lr=0.001)
Loss: sparse_categorical_crossentropy
Batch Size: 512
Epochs: 30
Validation Split: 20% of train
Class Weights: ✓ Balanced
Random Seed: 42
Imbalance Strategy
Method: sklearn compute_class_weight
DOS_SYN_Hping weight: ~0.08 (majority)
NMAP_FIN_SCAN weight: ~270 (rarest class)
Stratified split: ✓ Yes
Metric reported: Macro F1 + per-class
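The "balanced" weighting can be sketched on a miniature 90/10 skew; each class weight is `n_samples / (n_classes * class_count)`, so rare classes get proportionally larger weights:

```python
import numpy as np
from sklearn.utils.class_weight import compute_class_weight

# Miniature skew: 90 majority vs 10 minority samples.
y = np.array([0] * 90 + [1] * 10)

weights = compute_class_weight(class_weight="balanced", classes=np.unique(y), y=y)
# class 0: 100 / (2 * 90) ~= 0.56, class 1: 100 / (2 * 10) = 5.0
print(dict(zip(np.unique(y), weights)))
```

The resulting dict is what gets passed to Keras via `model.fit(..., class_weight=...)`.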

04 — Results

MODEL
PERFORMANCE

Evaluated on the held-out 20% test set. Per-class F1 is the key metric given the severe class imbalance.

98.4% Test Accuracy
Baseline (majority class only) = 76.9%
0.97 Macro F1
Averaged equally across all 12 classes
0.99 Weighted F1
Weighted by class sample counts
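The gap between these two averages is the whole point of reporting macro F1 on skewed data. A toy sketch: a classifier that is perfect on a 95% majority class but weak on a 5% minority class still scores high accuracy and weighted F1, while macro F1 exposes the failure.

```python
import numpy as np
from sklearn.metrics import f1_score

# Perfect on the majority class, poor on the rare one.
y_true = np.array([0] * 95 + [1] * 5)
y_pred = np.array([0] * 95 + [1, 0, 0, 0, 0])  # catches only 1 of 5 rare samples

macro = f1_score(y_true, y_pred, average="macro")       # classes count equally
weighted = f1_score(y_true, y_pred, average="weighted") # majority dominates
print(round(macro, 3), round(weighted, 3))
```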
Per-Class F1 Score

05 — Verifiability

PUBLIC &
REPRODUCIBLE

Every artefact needed to reproduce or audit this project is openly available.

06 — Observations

TRAINING
INSIGHTS

Key findings from the loss curve behaviour and recommendations for future iterations.

⚖️

Inverted Loss Curve

Training loss sitting above validation loss is expected here: Keras applies the balanced class weights only to the training loss, so mistakes on rare classes are penalised more heavily during training while the validation loss is computed unweighted. The model is being pushed harder on minority categories, exactly as intended.

🔬

Oversampling

To improve precision on the weakest categories, future iterations should explore synthetic oversampling, particularly SMOTE, applied to ultra-rare classes like NMAP_FIN_SCAN (28 samples) and Metasploit_Brute_Force_SSH (37 samples).

🧠

Advanced Architectures

Deeper hidden layers, or gradient-boosted tree ensembles such as XGBoost, could yield gains on the extreme-imbalance cases. Tree-based ensembles often handle skewed tabular distributions more effectively than small neural networks at this scale.