# Hybrid Diffusion Design (HAI 21.03)
## 1) Data representation
- Input sequence length: T (e.g., 64 or 128 time steps).
- Continuous features: 53 columns (sensor/process values).
- Discrete features: 30 columns (binary or low-cardinality states + attack labels).
- Time column: `time` is excluded from modeling; use index-based position/time embeddings.
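To make the split concrete, here is a minimal loading sketch in Python. The key names (`continuous`, `discrete`), the example column names, and the CSV filename are illustrative assumptions, not the actual contents of `feature_split.json`.

```python
import json

import pandas as pd

# Hypothetical structure for feature_split.json; the real keys and
# column names may differ:
# {"continuous": ["P1_B2004", ...], "discrete": ["P1_PP01AD", ...]}
with open("feature_split.json") as f:
    split = json.load(f)

df = pd.read_csv("hai_21_03_train.csv")        # assumed filename
x_cont = df[split["continuous"]].to_numpy()    # (N, 53) floats
x_disc = df[split["discrete"]].to_numpy()      # (N, 30) integer codes
# `time` is dropped entirely; positions come from the row index.
```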
## 2) Forward processes
### Continuous (Gaussian DDPM)
- Use a cosine beta schedule with `timesteps=1000`.
- Forward: `x_t = sqrt(a_bar_t) * x_0 + sqrt(1-a_bar_t) * eps`.
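A minimal sketch of this forward step in PyTorch, assuming the cosine schedule from Nichol & Dhariwal (2021); the name `q_sample_continuous` matches the training loop below, while the shapes and clamping are illustrative choices.

```python
import math

import torch

def cosine_alpha_bar(timesteps: int = 1000, s: float = 0.008) -> torch.Tensor:
    """Cumulative alpha_bar_t under the cosine schedule."""
    t = torch.linspace(0, timesteps, timesteps + 1) / timesteps
    f = torch.cos((t + s) / (1 + s) * math.pi / 2) ** 2
    return (f / f[0])[1:].clamp(1e-5, 1.0)        # (timesteps,)

ALPHA_BAR = cosine_alpha_bar()

def q_sample_continuous(x0: torch.Tensor, t: torch.Tensor):
    """x_t = sqrt(a_bar_t) * x_0 + sqrt(1 - a_bar_t) * eps.

    x0: (B, T, 53) normalized continuous features; t: (B,) long timesteps.
    """
    eps = torch.randn_like(x0)
    a_bar = ALPHA_BAR[t].view(-1, 1, 1)           # broadcast over (T, 53)
    x_t = a_bar.sqrt() * x0 + (1 - a_bar).sqrt() * eps
    return x_t, eps
```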
### Discrete (mask diffusion)
- Use `[MASK]` replacement with probability `p(t)`.
- Simple schedule: `p(t) = t / T`, where `T` here denotes the number of diffusion timesteps, not the sequence length from section 1.
- Model predicts original token at masked positions only.
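A matching sketch of the masking process, under the simplifying assumption that every feature shares one reserved `[MASK]` index; a real implementation may reserve one per vocabulary.

```python
import torch

def q_sample_discrete(tokens: torch.Tensor, t: torch.Tensor,
                      mask_id: int, timesteps: int = 1000):
    """Replace each token with `mask_id` independently with prob p(t) = t / T.

    tokens: (B, T, 30) long integer codes; t: (B,) long diffusion timesteps.
    Returns the corrupted tokens and the boolean mask of corrupted positions.
    """
    p = (t.float() / timesteps).view(-1, 1, 1)    # (B, 1, 1), broadcasts
    masked = torch.rand_like(tokens, dtype=torch.float) < p
    x_t = torch.where(masked, torch.full_like(tokens, mask_id), tokens)
    return x_t, masked
```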
## 3) Shared backbone + heads
- Inputs: concatenated continuous projection + discrete embeddings + time embedding.
- Backbone: GRU or temporal transformer.
- Heads:
  - Continuous head predicts the noise `eps`.
  - Discrete heads predict logits per discrete feature.
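A compact PyTorch sketch of one possible backbone (the GRU variant). Dimensions, vocabulary sizes, and the learned step embedding are illustrative, and embeddings are summed rather than concatenated purely to keep the sketch short.

```python
import torch
import torch.nn as nn

class HybridBackbone(nn.Module):
    """GRU backbone with a noise head and per-feature classification heads."""

    def __init__(self, n_cont=53, disc_vocab_sizes=(3,) * 30,
                 d_model=128, t_max=1000):
        super().__init__()
        self.cont_proj = nn.Linear(n_cont, d_model)
        # One embedding table per discrete feature (+1 slot for [MASK]).
        self.disc_emb = nn.ModuleList(
            [nn.Embedding(v + 1, d_model) for v in disc_vocab_sizes])
        self.t_emb = nn.Embedding(t_max, d_model)  # diffusion-step embedding
        self.backbone = nn.GRU(d_model, d_model, batch_first=True)
        self.eps_head = nn.Linear(d_model, n_cont)
        self.disc_heads = nn.ModuleList(
            [nn.Linear(d_model, v) for v in disc_vocab_sizes])

    def forward(self, x_cont, x_disc, t):
        # x_cont: (B, T, 53) float; x_disc: (B, T, 30) long; t: (B,) long.
        h = self.cont_proj(x_cont)
        for i, emb in enumerate(self.disc_emb):
            h = h + emb(x_disc[..., i])
        h = h + self.t_emb(t)[:, None, :]          # broadcast over time axis
        h, _ = self.backbone(h)
        eps_pred = self.eps_head(h)                # (B, T, 53)
        logits = [head(h) for head in self.disc_heads]  # each (B, T, V_i)
        return eps_pred, logits
```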
## 4) Loss
- Continuous: `L_cont = MSE(eps_pred, eps)`.
- Discrete: `L_disc = CE(logits, target)` on masked positions only.
- Combined: `L = lambda * L_cont + (1 - lambda) * L_disc`.
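A sketch of the combined objective under the same assumptions; `lambda_` is the weighting above, and the CE is averaged over masked positions only.

```python
import torch
import torch.nn.functional as F

def hybrid_loss(eps_pred, eps, logits, targets, masked, lambda_=0.5):
    """L = lambda * MSE(eps_pred, eps) + (1 - lambda) * masked CE.

    logits: list of (B, T, V_i) per discrete feature; targets: (B, T, 30) long;
    masked: (B, T, 30) bool from q_sample_discrete.
    """
    l_cont = F.mse_loss(eps_pred, eps)
    ce_terms = []
    for i, lg in enumerate(logits):
        m = masked[..., i]
        if m.any():                                # skip features with no masks
            ce_terms.append(F.cross_entropy(lg[m], targets[..., i][m]))
    l_disc = torch.stack(ce_terms).mean() if ce_terms else eps_pred.new_zeros(())
    return lambda_ * l_cont + (1 - lambda_) * l_disc
```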
## 5) Training loop (high level)
1. Load a batch of sequences.
2. Sample timesteps `t`.
3. Apply `q_sample_continuous` and `q_sample_discrete`.
4. Forward pass through the model; compute both losses.
5. Backprop + optimizer step.
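Putting the steps together, a minimal training step; the helper names mirror the sketches above, and the `loader`/optimizer setup is standard PyTorch boilerplate (construction omitted).

```python
import torch

model = HybridBackbone()
opt = torch.optim.AdamW(model.parameters(), lr=1e-4)
TIMESTEPS = 1000
MASK_ID = 3  # one past the largest real code, matching the +1 embedding slot

for x_cont, x_disc in loader:                          # step 1: (B, T, *) batches
    t = torch.randint(0, TIMESTEPS, (x_cont.size(0),))  # step 2
    x_cont_t, eps = q_sample_continuous(x_cont, t)       # step 3
    x_disc_t, masked = q_sample_discrete(x_disc, t, MASK_ID)
    eps_pred, logits = model(x_cont_t, x_disc_t, t)      # step 4
    loss = hybrid_loss(eps_pred, eps, logits, x_disc, masked)
    opt.zero_grad()
    loss.backward()                                      # step 5
    opt.step()
```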
## 6) Sampling (high level)
- Continuous: standard reverse diffusion from pure noise.
- Discrete: start from all `[MASK]` and iteratively refine tokens.
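For the discrete side, one possible unmasking loop, sketched under the assumption that the continuous channel is held fixed as conditioning; a full joint sampler would update both channels in lockstep, and the re-masking rule here (random, tracking `p(t)`) is one common choice rather than something this design fixes.

```python
import torch

@torch.no_grad()
def sample_discrete(model, x_cont_t, seq_len=64, n_disc=30,
                    mask_id=3, timesteps=1000, n_steps=20):
    """Start from all [MASK]; alternately predict tokens and re-mask
    a shrinking random fraction so the mask rate tracks p(t) = t / T."""
    B = x_cont_t.size(0)
    x = torch.full((B, seq_len, n_disc), mask_id, dtype=torch.long)
    for step in reversed(range(1, n_steps + 1)):
        t = torch.full((B,), step * timesteps // n_steps - 1, dtype=torch.long)
        _, logits = model(x_cont_t, x, t)
        pred = torch.stack([lg.argmax(-1) for lg in logits], dim=-1)
        # Re-mask a random subset at the next step's mask rate; at the final
        # step p_next = 0, so the output is fully unmasked.
        p_next = (step - 1) / n_steps
        remask = torch.rand_like(pred, dtype=torch.float) < p_next
        x = torch.where(remask, torch.full_like(pred, mask_id), pred)
    return x
```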
## 7) Files in this example
- `feature_split.json`: column split for HAI 21.03.
- `hybrid_diffusion.py`: model + diffusion utilities.
- `train_stub.py`: end-to-end scaffold for loss computation.