Hybrid Diffusion Design (HAI 21.03)
1) Data representation
- Input sequence length: T (e.g., 64 or 128 time steps).
- Continuous features: 53 columns (sensor/process values).
- Discrete features: 30 columns (binary or low-cardinality states + attack labels).
- Time column: time is excluded from modeling; use index-based position/time embeddings.
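A minimal sketch of this split, assuming a raw array whose first column is time, followed by the 53 continuous and 30 discrete columns (the actual column indices would come from feature_split.json; the layout here is illustrative):

```python
import numpy as np

def split_features(raw, n_cont=53, n_disc=30):
    """raw: (T, 1 + n_cont + n_disc) array with the time column first.

    The time column is dropped; index-based positions replace it.
    """
    cont = raw[:, 1:1 + n_cont].astype(np.float32)
    disc = raw[:, 1 + n_cont:1 + n_cont + n_disc].astype(np.int64)
    pos = np.arange(raw.shape[0])  # positions for index-based embeddings
    return cont, disc, pos
```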
2) Forward processes
Continuous (Gaussian DDPM)
- Use cosine beta schedule with timesteps=1000.
- Forward: x_t = sqrt(a_bar_t) * x_0 + sqrt(1 - a_bar_t) * eps.
Discrete (mask diffusion)
- Use [MASK] replacement with probability p(t).
- Simple schedule: p(t) = t / T.
- Model predicts the original token at masked positions only.
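The two forward processes above can be sketched as follows (function names are illustrative; the cosine alpha_bar follows the standard Nichol & Dhariwal form):

```python
import numpy as np

def cosine_alpha_bar(timesteps=1000, s=0.008):
    """Cumulative alpha_bar under the cosine schedule."""
    steps = np.arange(timesteps + 1, dtype=np.float64)
    f = np.cos(((steps / timesteps) + s) / (1 + s) * np.pi / 2) ** 2
    return f[1:] / f[0]  # alpha_bar_t for t = 1..timesteps

def q_sample_continuous(x0, t, alpha_bar, rng):
    """x_t = sqrt(a_bar_t) * x_0 + sqrt(1 - a_bar_t) * eps."""
    eps = rng.standard_normal(x0.shape)
    a = alpha_bar[t]
    return np.sqrt(a) * x0 + np.sqrt(1.0 - a) * eps, eps

def q_sample_discrete(tokens, t, timesteps, mask_id, rng):
    """Replace each token with mask_id with probability p(t) = t / T."""
    p = t / timesteps
    mask = rng.random(tokens.shape) < p
    return np.where(mask, mask_id, tokens), mask
```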
3) Shared backbone + heads
- Inputs: concatenated continuous projection + discrete embeddings + time embedding.
- Backbone: GRU or temporal transformer.
- Heads:
  - Continuous head predicts noise eps.
  - Discrete heads predict logits per discrete feature.
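A minimal PyTorch sketch of the shared backbone with the two kinds of heads, using the GRU option; class name, embedding widths, and the binary-vocabulary default are assumptions, not from the source:

```python
import torch
import torch.nn as nn

class HybridDiffusionNet(nn.Module):
    """Shared temporal backbone + continuous head + per-feature discrete heads."""
    def __init__(self, n_cont=53, n_disc=30, vocab_sizes=None,
                 d_model=128, t_max=1000):
        super().__init__()
        vocab_sizes = vocab_sizes or [2] * n_disc  # assume binary states
        emb_dim = d_model // 8
        # +1 slot in each embedding table for the [MASK] token
        self.disc_emb = nn.ModuleList(
            nn.Embedding(v + 1, emb_dim) for v in vocab_sizes)
        self.cont_proj = nn.Linear(n_cont, d_model)
        self.t_emb = nn.Embedding(t_max, d_model)
        in_dim = d_model + n_disc * emb_dim + d_model
        self.backbone = nn.GRU(in_dim, d_model, batch_first=True)
        self.cont_head = nn.Linear(d_model, n_cont)  # predicts eps
        self.disc_heads = nn.ModuleList(
            nn.Linear(d_model, v) for v in vocab_sizes)  # logits per feature

    def forward(self, x_cont, x_disc, t):
        # x_cont: (B, T, n_cont) float, x_disc: (B, T, n_disc) long, t: (B,) long
        B, T, _ = x_cont.shape
        emb = [e(x_disc[..., i]) for i, e in enumerate(self.disc_emb)]
        te = self.t_emb(t)[:, None, :].expand(B, T, -1)
        h = torch.cat([self.cont_proj(x_cont), *emb, te], dim=-1)
        h, _ = self.backbone(h)
        return self.cont_head(h), [head(h) for head in self.disc_heads]
```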
4) Loss
- Continuous: L_cont = MSE(eps_pred, eps).
- Discrete: L_disc = CE(logits, target) on masked positions only.
- Combined: L = lambda * L_cont + (1 - lambda) * L_disc.
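The combined loss can be sketched like this (argument names are illustrative; `mask` marks the positions that were replaced with [MASK] in the forward process):

```python
import torch
import torch.nn.functional as F

def hybrid_loss(eps_pred, eps, logits, targets, mask, lam=0.5):
    """L = lam * L_cont + (1 - lam) * L_disc.

    eps_pred/eps: (B, T, n_cont); logits: list of (B, T, vocab_i);
    targets/mask: (B, T, n_disc); CE is taken on masked positions only.
    """
    l_cont = F.mse_loss(eps_pred, eps)
    l_disc = 0.0
    for i, lg in enumerate(logits):
        fi = mask[..., i]  # (B, T) bool mask for discrete feature i
        if fi.any():
            l_disc = l_disc + F.cross_entropy(lg[fi], targets[..., i][fi])
    l_disc = l_disc / len(logits)
    return lam * l_cont + (1 - lam) * l_disc
```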
5) Training loop (high level)
- Load a batch of sequences.
- Sample timesteps t.
- Apply q_sample_continuous and q_sample_discrete.
- Forward model, compute losses.
- Backprop + optimizer step.
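The loop above, shown for the continuous branch only (the discrete branch follows the same pattern); the stand-in GRU model, schedule, and random data are illustrative placeholders for the pieces in hybrid_diffusion.py:

```python
import torch
import torch.nn as nn
import torch.nn.functional as F

T_STEPS = 1000
# Stand-in cosine-style alpha_bar schedule (unnormalized, for illustration)
alpha_bar = torch.cos(
    (torch.linspace(0, 1, T_STEPS) + 0.008) / 1.008 * torch.pi / 2) ** 2

model = nn.GRU(53, 53, batch_first=True)  # stand-in backbone + head
opt = torch.optim.Adam(model.parameters(), lr=1e-3)

for step in range(3):                                  # a few dummy steps
    x0 = torch.randn(8, 64, 53)                        # load a batch
    t = torch.randint(0, T_STEPS, (8,))                # sample timesteps
    a = alpha_bar[t].view(-1, 1, 1)
    eps = torch.randn_like(x0)
    xt = a.sqrt() * x0 + (1 - a).sqrt() * eps          # q_sample_continuous
    eps_pred, _ = model(xt)                            # forward model
    loss = F.mse_loss(eps_pred, eps)                   # compute loss
    opt.zero_grad()
    loss.backward()                                    # backprop
    opt.step()                                         # optimizer step
```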
6) Sampling (high level)
- Continuous: standard reverse diffusion from pure noise.
- Discrete: start from all [MASK] and iteratively refine tokens.
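One common way to realize the discrete refinement (an assumption here; the source does not fix the unmasking rule) is confidence-based iterative unmasking: each round, commit the most confident fraction of still-masked positions.

```python
import numpy as np

def sample_discrete(predict_fn, shape, mask_id, n_iters=10, rng=None):
    """Start from all-[MASK] tokens; predict_fn(tokens) -> (..., vocab) logits.

    Each iteration unmasks the most confident remaining positions, so that
    everything is committed by the final iteration.
    """
    rng = rng or np.random.default_rng()
    tokens = np.full(shape, mask_id, dtype=np.int64)
    for it in range(n_iters):
        logits = predict_fn(tokens)
        probs = np.exp(logits - logits.max(-1, keepdims=True))
        probs /= probs.sum(-1, keepdims=True)
        pred = probs.argmax(-1)
        conf = probs.max(-1)
        masked = tokens == mask_id
        if not masked.any():
            break
        # unmask the top (it+1)/n_iters fraction of masked positions
        k = max(1, int(masked.sum() * (it + 1) / n_iters))
        conf = np.where(masked, conf, -np.inf)  # never re-touch committed tokens
        idx = np.argpartition(conf.ravel(), -k)[-k:]
        flat = tokens.ravel()
        flat[idx] = pred.ravel()[idx]
        tokens = flat.reshape(shape)
    return tokens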
7) Files in this example
- feature_split.json: column split for HAI 21.03.
- hybrid_diffusion.py: model + diffusion utilities.
- train_stub.py: end-to-end scaffold for loss computation.