From ddc06aa8a7b20b2de567e6b98c64f6e69796a2bc Mon Sep 17 00:00:00 2001
From: MZ YANG <123012548+HOWARD-mzYANG@users.noreply.github.com>
Date: Mon, 29 Dec 2025 12:01:11 +0800
Subject: [PATCH] docs: update ML/DL pre-requirements

---
 knowledges/pre_requirements.md | 138 ++++++++++++++++++++++++++++++++-
 1 file changed, 135 insertions(+), 3 deletions(-)

diff --git a/knowledges/pre_requirements.md b/knowledges/pre_requirements.md
index 8136bbb..142e092 100644
--- a/knowledges/pre_requirements.md
+++ b/knowledges/pre_requirements.md
@@ -44,8 +44,140 @@ Optional: low level language, experience on **network programming**
 
 ## ML/DL
 
-> WIP(Work in Process)
->   
-> To be finished by Mingzhe Yang and Hongyu Yan
+> This section focuses on **practical, engineering-oriented understanding and usage** of ML/DL,  
+> rather than theoretical derivation or cutting-edge research.
+>
+> You are **NOT required** to be an ML researcher,  
+> but you should be able to **read code, run experiments, and explain model behaviors**.
+
+---
+
+### 1. Foundations
+
+> Assumed background: completion of a **Year 3 “AI Foundations” (or equivalent)** course.  
+> If you have not formally taken such a course, **understanding the core concepts is sufficient**.  
+> No in-depth theoretical derivations are required.
+
+You should know:
+
+- The difference between **Machine Learning (ML)** and **Deep Learning (DL)**
+- Learning paradigms:
+  - Supervised / Unsupervised / Self-supervised learning
+- Basic concepts:
+  - Dataset / Batch / Epoch
+  - Loss function
+  - Optimizer (e.g., SGD, Adam)
+  - Overfitting / Underfitting
+- The difference between training and inference
+
+You should be able to:
+
+- Train a basic model using common frameworks (e.g., `PyTorch`, `TensorFlow`)
+- Understand and implement a standard training loop:
+  - Forward pass → loss computation → backward pass → parameter update
+
+---
+
+### 2. Neural Network Basics
+
+You should know:
+
+- Common network layers:
+  - Linear / Fully Connected layers
+  - Convolution layers
+  - Normalization layers (BatchNorm / LayerNorm)
+- Common activation functions:
+  - ReLU / LeakyReLU / Sigmoid / Tanh
+- The **conceptual role** of backpropagation (no formula derivation required)
+
+You should understand:
+
+- How data flows through a neural network
+- How gradients affect parameter updates
+- Why deeper networks are harder to train (e.g., vanishing or exploding gradients)
+
+---
+
+### 3. Generative Models (Overview Level)
+
+Skim the following models and understand their **core ideas and behavioral characteristics**.
+
+#### GAN (Generative Adversarial Network)
+
+You should know:
+
+- The roles of the Generator and the Discriminator
+- The adversarial training process
+- What the loss functions roughly represent
+
+You should understand:
+
+- Why GAN training can be unstable
+- What *mode collapse* means
+- Typical use cases (e.g., image generation, data augmentation)
+
+Optional but recommended:
+
+- Run or read the code of a simple GAN implementation (e.g., DCGAN)
+
+---
+
+#### Diffusion Models
+
+You should know:
+
+- The forward process: gradually adding noise to data
+- The reverse process: denoising and sampling
+- Why diffusion models can generate high-quality samples
+
+You should understand:
+
+- How diffusion models and GANs differ in training stability
+- Why diffusion sampling is usually slower than GAN sampling (many iterative denoising steps vs. a single forward pass)
+- The high-level idea of noise prediction vs. data prediction
+
+Optional but recommended:
+
+- Run inference using a pretrained diffusion model
+- Understand the role of the timestep and the scheduler
+
+---
+
+### 4. Engineering Perspective
+
+You should be familiar with:
+
+- Differences between GPU and CPU training / inference
+- Basic memory and performance considerations
+- Model checkpoint loading and saving
+- Reproducibility basics (random seed, configuration, logging)
+
+You should be able to:
+
+- Read and modify existing ML/DL codebases
+- Debug common issues:
+  - NaN loss
+  - Loss not converging
+  - OOM (out-of-memory)
+- Integrate ML/DL components into a larger system (e.g., networked services, data pipelines)
+
+---
+
+### 5. Relation to This Project
+
+You should understand:
+
+- ML/DL models are treated as **modules**, not black boxes
+- Model outputs should be **interpretable or observable** when possible
+- ML components may interact with:
+  - Network traffic
+  - Logs / metrics
+  - Online or streaming data
+
+You are expected to:
+
+- Use ML/DL as a **tool**, not an end goal
+- Be comfortable combining ML logic with system / network code
+
 
 Take a glance on `GAN` and `Diffusion`. 
-- 
2.49.1