Update example and notes
45
example/model_design.md
Normal file
@@ -0,0 +1,45 @@
# Hybrid Diffusion Design (HAI 21.03)

## 1) Data representation
- Input sequence length: T (e.g., 64 or 128 time steps).
- Continuous features: 53 columns (sensor/process values).
- Discrete features: 30 columns (binary or low-cardinality states + attack labels).
- Time column: `time` is excluded from modeling; use index-based position/time embeddings.

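As a rough illustration of the layout above, the sketch below cuts a table of rows into fixed-length windows and separates continuous from discrete columns, dropping everything else (including `time`). The function name `make_windows`, the `stride` parameter, and the dict-per-row input are assumptions made for this example; they are not taken from `hybrid_diffusion.py`.

```python
from typing import Dict, List, Sequence, Tuple

Row = Dict[str, float]
Window = Tuple[List[List[float]], List[List[int]]]


def make_windows(
    rows: Sequence[Row],
    continuous_cols: Sequence[str],
    discrete_cols: Sequence[str],
    seq_len: int,
    stride: int = 1,  # hypothetical parameter, not specified in the design
) -> List[Window]:
    """Cut `rows` into windows of `seq_len` steps.

    Only the listed columns are kept, so a `time` column is dropped simply
    by not appearing in either list.
    """
    windows: List[Window] = []
    for start in range(0, len(rows) - seq_len + 1, stride):
        chunk = rows[start:start + seq_len]
        cont = [[float(r[c]) for c in continuous_cols] for r in chunk]
        disc = [[int(r[c]) for c in discrete_cols] for r in chunk]
        windows.append((cont, disc))
    return windows
```

For HAI 21.03, each window would then hold a `T x 53` continuous part and a `T x 30` discrete part.
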
## 2) Forward processes
### Continuous (Gaussian DDPM)
- Use a cosine beta schedule with `timesteps=1000`.
- Forward: `x_t = sqrt(a_bar_t) * x_0 + sqrt(1-a_bar_t) * eps`, where `eps ~ N(0, I)` and `a_bar_t` is the cumulative product of `1 - beta_s` for `s <= t`.

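The continuous forward process can be sketched in plain Python as follows. This is a minimal illustration, not the code in `hybrid_diffusion.py`: it assumes the commonly used cosine schedule with offset `s = 0.008` and betas clipped at 0.999, treats `t` as running from 1 to `timesteps`, and works on flat lists rather than tensors. The helper name `cosine_alpha_bar` is hypothetical.

```python
import math
import random
from typing import List, Sequence, Tuple


def cosine_alpha_bar(timesteps: int, s: float = 0.008) -> List[float]:
    """Return a_bar_t for t = 1..timesteps under a cosine schedule."""

    def f(t: int) -> float:
        return math.cos(((t / timesteps + s) / (1 + s)) * math.pi / 2) ** 2

    alpha_bar: List[float] = []
    running = 1.0
    for t in range(1, timesteps + 1):
        beta = min(1.0 - f(t) / f(t - 1), 0.999)  # clip to avoid beta = 1
        running *= 1.0 - beta                     # cumulative product of (1 - beta)
        alpha_bar.append(running)
    return alpha_bar


def q_sample_continuous(
    x0: Sequence[float],
    t: int,
    alpha_bar: Sequence[float],
    rng: random.Random,
) -> Tuple[List[float], List[float]]:
    """Apply x_t = sqrt(a_bar_t) * x_0 + sqrt(1 - a_bar_t) * eps."""
    a = alpha_bar[t - 1]  # t is 1-based in this sketch
    eps = [rng.gauss(0.0, 1.0) for _ in x0]
    xt = [math.sqrt(a) * x + math.sqrt(1.0 - a) * e for x, e in zip(x0, eps)]
    return xt, eps
```

Returning `eps` alongside `x_t` keeps the regression target for the noise-prediction head available to the caller.
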
### Discrete (mask diffusion)
- Replace each token with `[MASK]` independently with probability `p(t)`.
- Simple schedule: `p(t) = t / T`, where `T` here is the total number of diffusion steps, not the sequence length from section 1.
- The model predicts the original token at masked positions only.

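A minimal sketch of the mask-based forward process is shown below. The sentinel `MASK_ID = -1`, the helper `mask_prob`, and the name `num_steps` for the total number of diffusion steps are assumptions for this example; an actual implementation would more likely reserve an extra vocabulary index per feature.

```python
import random
from typing import List, Sequence, Tuple

MASK_ID = -1  # hypothetical sentinel standing in for [MASK]


def mask_prob(t: int, num_steps: int) -> float:
    """Linear schedule p(t) = t / num_steps."""
    return t / num_steps


def q_sample_discrete(
    x0: Sequence[int],
    t: int,
    num_steps: int,
    rng: random.Random,
) -> Tuple[List[int], List[bool]]:
    """Mask each token independently with probability p(t).

    Returns the corrupted tokens and a boolean mask marking the positions
    that were replaced, which is where the loss is later evaluated.
    """
    p = mask_prob(t, num_steps)
    masked = [rng.random() < p for _ in x0]
    xt = [MASK_ID if m else tok for tok, m in zip(x0, masked)]
    return xt, masked
```

At `t = 0` nothing is masked, and at `t = num_steps` every position is masked, which matches the all-`[MASK]` starting point used for sampling.
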
## 3) Shared backbone + heads
- Inputs: concatenated continuous projection + discrete embeddings + time embedding.
- Backbone: GRU or temporal transformer.
- Heads:
  - Continuous head predicts noise `eps`.
  - Discrete heads predict logits per discrete feature.

## 4) Loss
- Continuous: `L_cont = MSE(eps_pred, eps)`.
- Discrete: `L_disc = CE(logits, target)` on masked positions only.
- Combined: `L = lambda * L_cont + (1 - lambda) * L_disc`, with `lambda` in `[0, 1]`.

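The three loss terms above can be written out in plain Python as a sanity check. This is only a sketch on flat lists: the function names are hypothetical, and returning `0.0` when no position is masked is a choice made for this example, not something the design specifies.

```python
import math
from typing import Sequence


def mse(pred: Sequence[float], target: Sequence[float]) -> float:
    """Mean squared error between predicted and true noise."""
    return sum((p - t) ** 2 for p, t in zip(pred, target)) / len(pred)


def masked_cross_entropy(
    logits: Sequence[Sequence[float]],
    targets: Sequence[int],
    masked: Sequence[bool],
) -> float:
    """Average cross-entropy over masked positions only."""
    total, count = 0.0, 0
    for row, tgt, is_masked in zip(logits, targets, masked):
        if not is_masked:
            continue
        mx = max(row)  # subtract the max for a numerically stable log-sum-exp
        log_z = mx + math.log(sum(math.exp(v - mx) for v in row))
        total += log_z - row[tgt]
        count += 1
    return total / count if count else 0.0  # assumption: empty mask gives 0


def hybrid_loss(l_cont: float, l_disc: float, lam: float) -> float:
    """L = lambda * L_cont + (1 - lambda) * L_disc."""
    return lam * l_cont + (1.0 - lam) * l_disc
```

Because the two terms live on different scales (squared error versus nats), `lambda` usually needs tuning rather than defaulting to 0.5.
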
## 5) Training loop (high level)
1. Load a batch of sequences.
2. Sample diffusion timesteps `t`.
3. Apply `q_sample_continuous` and `q_sample_discrete`.
4. Run the model forward and compute the losses.
5. Backpropagate and take an optimizer step.

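The data flow between these steps can be sketched as a single function that takes each component as a callable. Everything about the signatures here is an assumption for illustration (one `t` per call, the argument order of `model` and `loss_fn`, and `optimizer_step` standing in for backprop plus the optimizer update); `train_stub.py` may be organized differently.

```python
import random
from typing import Any, Callable, Sequence


def train_step(
    x0_cont: Sequence[float],
    x0_disc: Sequence[int],
    num_steps: int,
    q_sample_continuous: Callable[..., Any],
    q_sample_discrete: Callable[..., Any],
    model: Callable[..., Any],
    loss_fn: Callable[..., float],
    optimizer_step: Callable[[float], None],
    rng: random.Random,
) -> float:
    """Run steps 2 to 5 for one already-loaded batch."""
    t = rng.randint(1, num_steps)                       # 2. sample a timestep
    xt_cont, eps = q_sample_continuous(x0_cont, t)      # 3. corrupt both parts
    xt_disc, masked = q_sample_discrete(x0_disc, t)
    eps_pred, logits = model(xt_cont, xt_disc, t)       # 4. forward pass
    loss = loss_fn(eps_pred, eps, logits, x0_disc, masked)
    optimizer_step(loss)                                # 5. backprop + update
    return loss
```

Note that the clean discrete tokens `x0_disc` serve as the classification targets, while the sampled `eps` is the regression target for the continuous head.
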
## 6) Sampling (high level)
- Continuous: standard reverse diffusion from pure noise.
- Discrete: start from all `[MASK]` and iteratively refine tokens.

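One possible reading of the discrete sampling procedure is sketched below. It assumes the linear schedule `p(t) = t / num_steps`, under which a position that is still masked at step `t` is revealed with probability `1 / t`; once revealed, a token is kept as-is. The `predict` callable, the `MASK_ID` sentinel, and this particular unmasking rule are assumptions for the example, and other refinement schemes (for instance, re-masking low-confidence tokens) are equally compatible with the description above.

```python
import random
from typing import Callable, List, Sequence

MASK_ID = -1  # hypothetical sentinel standing in for [MASK]


def sample_discrete(
    length: int,
    num_steps: int,
    predict: Callable[[Sequence[int], int], Sequence[int]],
    rng: random.Random,
) -> List[int]:
    """Start from all-[MASK] and progressively reveal predicted tokens."""
    x = [MASK_ID] * length
    for t in range(num_steps, 0, -1):
        guesses = predict(x, t)  # model's token guess for every position
        reveal_prob = 1.0 / t    # (p(t) - p(t-1)) / p(t) for p(t) = t / num_steps
        x = [
            guesses[i] if tok == MASK_ID and rng.random() < reveal_prob else tok
            for i, tok in enumerate(x)
        ]
    return x
```

At `t = 1` the reveal probability is 1, so no `[MASK]` tokens remain in the returned sequence.
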
## 7) Files in this example
- `feature_split.json`: column split for HAI 21.03.
- `hybrid_diffusion.py`: model + diffusion utilities.
- `train_stub.py`: end-to-end scaffold for loss computation.