# Design & Decision Log

## 2026-01-26 — Two-stage temporal backbone (GRU) + residual diffusion
- **Decision**: Add a stage-1 GRU trend model, then train the diffusion model on the residuals (see the sketch after this list).
- **Why**: Separate temporal consistency from distribution alignment.
- **Files**:
  - `example/hybrid_diffusion.py` (added `TemporalGRUGenerator`)
  - `example/train.py` (two-stage training + residual diffusion)
  - `example/sample.py`, `example/export_samples.py` (trend + residual synthesis)
  - `example/config.json` (temporal hyperparameters)
- **Expected effect**: Improved lag-1 consistency; KS may worsen if the residual distribution drifts.
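
A minimal sketch of the two-stage split, assuming a PyTorch setup. `TrendGRU` and the one-step-ahead trend objective below are illustrative stand-ins, not the actual `TemporalGRUGenerator` or the training loop in `example/train.py`:

```python
# Illustrative sketch: TrendGRU stands in for the real TemporalGRUGenerator.
import torch
import torch.nn as nn

class TrendGRU(nn.Module):
    """Stage 1: a GRU that predicts the next step, capturing the smooth trend."""
    def __init__(self, feat_dim: int, hidden: int = 64):
        super().__init__()
        self.gru = nn.GRU(feat_dim, hidden, batch_first=True)
        self.head = nn.Linear(hidden, feat_dim)

    def forward(self, x: torch.Tensor) -> torch.Tensor:  # x: (B, T, F)
        h, _ = self.gru(x)
        return self.head(h)

x = torch.randn(8, 32, 4)               # (batch, time, features)
trend_model = TrendGRU(feat_dim=4)

# Stage 1: fit the trend with a one-step-ahead MSE objective.
trend = trend_model(x[:, :-1])          # predicts x[:, 1:]
trend_loss = nn.functional.mse_loss(trend, x[:, 1:])

# Stage 2: the diffusion model trains on what the trend cannot explain.
residual = (x[:, 1:] - trend).detach()  # diffusion training target
```
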
## 2026-01-26 — Residual distribution alignment losses
- **Decision**: Apply the distribution losses to residuals rather than raw `x0` (see the sketch after this list).
- **Why**: The diffusion stage models residuals, so alignment losses should match the residual distribution rather than the raw series.
- **Files**:
  - `example/train.py` (quantile loss on residuals)
  - `example/config.json` (quantile weight)
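
A hedged sketch of a quantile loss on residuals; the function name and the quantile-matching formulation are assumptions, not necessarily the exact loss in `example/train.py`:

```python
# Hypothetical quantile-alignment loss: match empirical quantiles of
# generated vs. real residuals, per feature.
import torch

def quantile_alignment_loss(gen_res: torch.Tensor,
                            real_res: torch.Tensor,
                            n_quantiles: int = 32) -> torch.Tensor:
    qs = torch.linspace(0.01, 0.99, n_quantiles, device=gen_res.device)
    g = gen_res.reshape(-1, gen_res.shape[-1])  # flatten batch/time, keep features
    r = real_res.reshape(-1, real_res.shape[-1])
    gq = torch.quantile(g, qs, dim=0)           # (n_quantiles, F)
    rq = torch.quantile(r, qs, dim=0)
    return torch.mean((gq - rq) ** 2)
```

In the training loop this term would be added to the diffusion objective, scaled by the quantile weight exposed in `example/config.json`.
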
## 2026-01-26 — SNR-weighted loss + residual stats
- **Decision**: Add an SNR-weighted loss and residual mean/std regularization (see the sketch after this list).
- **Why**: Stabilize diffusion training across noise levels and improve the KS statistic.
- **Files**:
  - `example/train.py`
  - `example/config.json`
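
A sketch assuming a DDPM-style cumulative alpha (ᾱ) schedule and min-SNR-style clipping, one common choice of weighting; the exact formulation in `example/train.py` may differ:

```python
# Assumed min-SNR-style weighting plus a first/second-moment regularizer.
import torch

def snr_weights(alpha_bar_t: torch.Tensor, gamma: float = 5.0) -> torch.Tensor:
    """Per-sample loss weights: SNR(t) = abar / (1 - abar), clipped at gamma."""
    snr = alpha_bar_t / (1.0 - alpha_bar_t)
    return torch.clamp(snr, max=gamma) / snr

def residual_stats_penalty(gen_res: torch.Tensor,
                           real_res: torch.Tensor) -> torch.Tensor:
    """Penalize drift in the residuals' mean and standard deviation."""
    mean_gap = (gen_res.mean() - real_res.mean()) ** 2
    std_gap = (gen_res.std() - real_res.std()) ** 2
    return mean_gap + std_gap
```
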
## 2026-01-26 — Switchable backbone (GRU vs Transformer)
- **Decision**: Make the diffusion backbone configurable via `backbone_type`, adding a Transformer encoder option (see the sketch after this list).
- **Why**: Test whether self-attention reduces the competition between temporal consistency and distribution alignment without altering the two-stage design.
- **Files**:
  - `example/hybrid_diffusion.py`
  - `example/train.py`
  - `example/sample.py`
  - `example/export_samples.py`
  - `example/config.json`
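
A minimal sketch of the `backbone_type` switch, again assuming PyTorch; the factory name and hyperparameters are illustrative, not the actual `example/hybrid_diffusion.py` code:

```python
# Illustrative backbone factory; names and sizes are assumptions.
import torch.nn as nn

def build_backbone(backbone_type: str, feat_dim: int,
                   hidden: int = 64, layers: int = 2) -> nn.Module:
    if backbone_type == "gru":
        return nn.GRU(feat_dim, hidden, num_layers=layers, batch_first=True)
    if backbone_type == "transformer":
        enc_layer = nn.TransformerEncoderLayer(
            d_model=feat_dim, nhead=2,  # nhead must divide feat_dim
            dim_feedforward=hidden, batch_first=True)
        return nn.TransformerEncoder(enc_layer, num_layers=layers)
    raise ValueError(f"unknown backbone_type: {backbone_type!r}")
```

Only the denoiser backbone swaps; the two-stage trend/residual design stays unchanged.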