Update example and notes

2026-01-09 02:14:20 +08:00
parent 200bdf6136
commit c0639386be
18 changed files with 31656 additions and 0 deletions

example/model_design.md Normal file

@@ -0,0 +1,45 @@
# Hybrid Diffusion Design (HAI 21.03)
## 1) Data representation
- Input sequence length: T (e.g., 64 or 128 time steps).
- Continuous features: 53 columns (sensor/process values).
- Discrete features: 30 columns (binary or low-cardinality states + attack labels).
- Time column: `time` is excluded from modeling; use index-based position/time embeddings.
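A minimal sketch of the column split described above. The structure of `feature_split.json` (a `"continuous"` list and a `"discrete"` list) and the column names in the toy row are assumptions for illustration; only the 53/30 split and the excluded `time` column come from this design.

```python
# Sketch: split one HAI 21.03 record into continuous and discrete parts.
# Column names below are hypothetical; feature_split.json's exact schema
# is an assumption (two lists keyed "continuous" and "discrete").
import json

def load_feature_split(path):
    """Read {"continuous": [...], "discrete": [...]} from feature_split.json."""
    with open(path) as f:
        split = json.load(f)
    return split["continuous"], split["discrete"]

def split_row(row, cont_cols, disc_cols):
    """Split one record (column -> value dict), dropping the `time` column."""
    cont = [float(row[c]) for c in cont_cols]
    disc = [int(row[c]) for c in disc_cols]
    return cont, disc

# Toy example with made-up column names:
row = {"time": "2021-01-01 00:00:00", "P1_FCV01D": 0.42, "attack": 0}
cont, disc = split_row(row, ["P1_FCV01D"], ["attack"])
```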
## 2) Forward processes
### Continuous (Gaussian DDPM)
- Use cosine beta schedule with `timesteps=1000`.
- Forward: `x_t = sqrt(a_bar_t) * x_0 + sqrt(1-a_bar_t) * eps`.
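The continuous forward process can be sketched directly from the formula above. The cosine schedule follows the standard `alpha_bar(t) = f(t)/f(0)` with `f(t) = cos^2(((t/T) + s)/(1 + s) * pi/2)`; helper names and array shapes here are assumptions, not the repo's API.

```python
# Sketch: Gaussian forward process q(x_t | x_0) with a cosine beta schedule.
import numpy as np

def cosine_alpha_bar(timesteps, s=0.008):
    """alpha_bar_t for t = 1..timesteps (Nichol & Dhariwal cosine schedule)."""
    t = np.arange(timesteps + 1)
    f = np.cos(((t / timesteps) + s) / (1 + s) * np.pi / 2) ** 2
    return f[1:] / f[0]

def q_sample_continuous(x0, t, alpha_bar, rng):
    """x_t = sqrt(a_bar_t) * x0 + sqrt(1 - a_bar_t) * eps, per-sample t."""
    eps = rng.standard_normal(x0.shape)
    ab = alpha_bar[t][:, None, None]        # (B,) -> (B, 1, 1) for broadcasting
    x_t = np.sqrt(ab) * x0 + np.sqrt(1.0 - ab) * eps
    return x_t, eps

alpha_bar = cosine_alpha_bar(1000)
rng = np.random.default_rng(0)
x0 = rng.standard_normal((4, 64, 53))       # (batch, T, continuous features)
t = rng.integers(0, 1000, size=4)
x_t, eps = q_sample_continuous(x0, t, alpha_bar, rng)
```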
### Discrete (mask diffusion)
- Use `[MASK]` replacement with probability `p(t)`.
- Simple linear schedule: `p(t) = t / timesteps` (the denominator is the total number of diffusion steps, not the sequence length `T`).
- Model predicts original token at masked positions only.
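The mask forward process above can be sketched in a few lines. The reserved `[MASK]` id and the array shapes are assumptions for illustration.

```python
# Sketch: mask diffusion forward process. Each token is independently
# replaced by MASK_ID with probability p(t) = t / timesteps.
import numpy as np

MASK_ID = -1  # assumed reserved id, outside every feature's vocabulary

def q_sample_discrete(tokens, t, timesteps, rng):
    """Return masked tokens and the boolean mask of replaced positions."""
    p = t / timesteps                                # (B,) mask probabilities
    mask = rng.random(tokens.shape) < p[:, None, None]
    x_t = np.where(mask, MASK_ID, tokens)
    return x_t, mask

rng = np.random.default_rng(0)
tokens = rng.integers(0, 2, size=(4, 64, 30))        # (batch, T, discrete features)
t = np.array([0, 500, 1000, 1000])
x_t, mask = q_sample_discrete(tokens, t, 1000, rng)
# t = 0 leaves the sequence intact; t = timesteps masks every position.
```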
## 3) Shared backbone + heads
- Inputs: concatenated continuous projection + discrete embeddings + time embedding.
- Backbone: GRU or temporal transformer.
- Heads:
- Continuous head predicts noise `eps`.
- Discrete heads predict logits per discrete feature.
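A shape-level sketch of how the backbone input is assembled from the three streams above. All dimensions, parameter names, and the sinusoidal time embedding are illustrative assumptions; the real model lives in `hybrid_diffusion.py`.

```python
# Sketch: continuous projection + per-feature discrete embeddings +
# diffusion-time embedding, concatenated into the backbone input.
import numpy as np

rng = np.random.default_rng(0)
B, T, C, D = 4, 64, 53, 30       # batch, sequence length, cont./disc. features
H, E = 128, 8                    # hidden size, per-feature embedding size

W_cont = rng.standard_normal((C, H)) * 0.02      # continuous projection
embeds = rng.standard_normal((D, 3, E)) * 0.02   # one table per discrete
                                                 # feature (vocab 3: 0, 1, MASK)

def sinusoidal_time_emb(t, dim):
    """Standard sinusoidal embedding of the diffusion timestep."""
    freqs = np.exp(-np.log(10000.0) * np.arange(dim // 2) / (dim // 2))
    ang = t[:, None] * freqs[None, :]
    return np.concatenate([np.sin(ang), np.cos(ang)], axis=-1)   # (B, dim)

x_cont = rng.standard_normal((B, T, C))
x_disc = rng.integers(0, 3, size=(B, T, D))
t = rng.integers(0, 1000, size=B)

h_cont = x_cont @ W_cont                                      # (B, T, H)
h_disc = np.concatenate(
    [embeds[j, x_disc[:, :, j]] for j in range(D)], axis=-1)  # (B, T, D*E)
h_time = np.broadcast_to(sinusoidal_time_emb(t, H)[:, None, :], (B, T, H))
h = np.concatenate([h_cont, h_disc, h_time], axis=-1)         # backbone input
```

From here, `h` would feed the GRU or temporal transformer, whose per-step outputs drive the noise head and the per-feature logit heads.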
## 4) Loss
- Continuous: `L_cont = MSE(eps_pred, eps)`.
- Discrete: `L_disc = CE(logits, target)` on masked positions only.
- Combined: `L = lambda * L_cont + (1 - lambda) * L_disc`.
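The combined objective can be sketched as below; `lam = 0.5` is just a placeholder for the free hyperparameter `lambda`, and the tiny arrays are synthetic.

```python
# Sketch: MSE on the noise plus masked-only cross-entropy, mixed by lambda.
import numpy as np

def mse(eps_pred, eps):
    return np.mean((eps_pred - eps) ** 2)

def masked_ce(logits, targets, mask):
    """Mean cross-entropy over masked positions; logits (..., K), int targets."""
    logits = logits - logits.max(axis=-1, keepdims=True)   # numerical stability
    logp = logits - np.log(np.exp(logits).sum(axis=-1, keepdims=True))
    nll = -np.take_along_axis(logp, targets[..., None], axis=-1)[..., 0]
    return (nll * mask).sum() / np.maximum(mask.sum(), 1)

lam = 0.5                                       # placeholder lambda
eps_pred = np.zeros((2, 4, 3)); eps = np.ones((2, 4, 3))
logits = np.zeros((2, 4, 2))                    # uniform over K = 2 classes
targets = np.zeros((2, 4), dtype=int)
mask = np.ones((2, 4), dtype=bool)
loss = lam * mse(eps_pred, eps) + (1 - lam) * masked_ce(logits, targets, mask)
```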
## 5) Training loop (high level)
1. Load a batch of sequences.
2. Sample timesteps `t`.
3. Apply `q_sample_continuous` and `q_sample_discrete`.
4. Forward model, compute losses.
5. Backprop + optimizer step.
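Steps 1-4 above can be wired together as follows; step 5 (backprop and the optimizer step) is framework-specific and only indicated by a comment. Every helper here is a simplified stand-in for the repo's real utilities, and the model outputs are stubs.

```python
# Sketch: one training iteration with stub model outputs (no real gradients).
import numpy as np

rng = np.random.default_rng(0)
timesteps, B, T, C, D, MASK_ID = 1000, 4, 64, 53, 30, -1

x0_cont = rng.standard_normal((B, T, C))          # 1. load a batch
x0_disc = rng.integers(0, 2, size=(B, T, D))
t = rng.integers(1, timesteps + 1, size=B)        # 2. sample timesteps

# 3. forward processes (alpha_bar evaluated by the unnormalized cosine form)
ab = np.cos(((t / timesteps) + 0.008) / 1.008 * np.pi / 2) ** 2
eps = rng.standard_normal(x0_cont.shape)
x_t_cont = (np.sqrt(ab)[:, None, None] * x0_cont
            + np.sqrt(1 - ab)[:, None, None] * eps)
mask = rng.random(x0_disc.shape) < (t / timesteps)[:, None, None]
x_t_disc = np.where(mask, MASK_ID, x0_disc)

eps_pred = np.zeros_like(eps)                     # 4. stub model outputs
l_cont = np.mean((eps_pred - eps) ** 2)
l_disc = np.log(2.0)          # CE of uniform logits over a binary vocabulary
loss = 0.5 * l_cont + 0.5 * l_disc
# 5. loss.backward(); optimizer.step()  -- in the actual training framework
```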
## 6) Sampling (high level)
- Continuous: standard reverse diffusion from pure noise.
- Discrete: start from all `[MASK]` and iteratively refine tokens.
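The discrete sampler above can be sketched as iterative unmasking: start from all `[MASK]` and, at each step, commit the model's most confident predictions so the masked fraction shrinks linearly. The stub `predict_logits` (random binary logits) stands in for the trained discrete heads and is purely an assumption.

```python
# Sketch: confidence-based iterative unmasking from an all-[MASK] start.
import numpy as np

MASK_ID = -1
rng = np.random.default_rng(0)

def predict_logits(x_t):
    """Stub head: random logits over a binary vocabulary (illustrative only)."""
    return rng.standard_normal(x_t.shape + (2,))

def sample_discrete(shape, steps):
    x = np.full(shape, MASK_ID)
    for step in range(steps, 0, -1):
        logits = predict_logits(x)
        conf = logits.max(axis=-1)
        pred = logits.argmax(axis=-1)
        # Unmask still-masked positions, most confident first, so roughly
        # (step - 1) / steps of all positions stay masked afterwards.
        still = (x == MASK_ID)
        keep = int(x.size * (step - 1) / steps)    # how many stay masked
        conf_flat = np.where(still.ravel(), conf.ravel(), -np.inf)
        order = np.argsort(-conf_flat)             # most confident first
        to_fill = order[: still.sum() - keep]
        x.flat[to_fill] = pred.ravel()[to_fill]
    return x

x = sample_discrete((1, 16, 4), steps=8)
```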
## 7) Files in this example
- `feature_split.json`: column split for HAI 21.03.
- `hybrid_diffusion.py`: model + diffusion utilities.
- `train_stub.py`: end-to-end scaffold for loss computation.