Update docs with latest architecture and results

2026-01-28 11:23:50 +08:00
parent e2974342d5
commit 34d6f0d808
5 changed files with 37 additions and 4 deletions


@@ -16,3 +16,8 @@ Tools:
- `example/run_all_full.py` for one-command full pipeline + diagnostics.
Notes:
- If `use_quantile_transform` is enabled, run `prepare_data.py` with `full_stats: true` to build quantile tables.
Current status (high level):
- Two-stage pipeline (GRU trend + diffusion residuals).
- Quantile transform + post-hoc calibration enabled for continuous features.
- Latest metrics (2026-01-27 21:22): avg_ks ~0.405 / avg_jsd ~0.038 / avg_lag1_diff ~0.145.
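For reference, the relevant switches might look like this as a config fragment (only `use_quantile_transform` and the `full_stats: true` flag appear in these docs; the surrounding structure is illustrative, so check `example/config.json` for the actual schema):

```json
{
  "use_quantile_transform": true,
  "full_stats": true
}
```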


@@ -71,3 +71,10 @@
- `example/export_samples.py`
- `example/prepare_data.py`
- `example/config.json`
## 2026-01-27 — Full quantile stats in preparation
- **Decision**: Enable full statistics when quantile transform is active.
- **Why**: Stabilize quantile tables and reduce CDF mismatch.
- **Files**:
- `example/prepare_data.py`
- `example/config.json`


@@ -27,3 +27,8 @@ YYYY-MM-DD
- Config: `example/config.json` (two-stage residual diffusion; user run on Windows)
- Result: 0.7096230 / 0.0331810 / 0.1898416
- Notes: slight KS improvement, lag-1 improves; still distribution/temporal trade-off.
## 2026-01-27
- Config: `example/config.json` (quantile transform + calibration, full stats)
- Result: 0.4046 / 0.0376 / 0.1449
- Notes: KS and lag-1 improved significantly; JSD regressed vs best discrete run.


@@ -11,3 +11,6 @@
## Two-stage training with curriculum
- Hypothesis: train diffusion on residuals only after temporal GRU converges to low error.
## Discrete calibration
- Hypothesis: post-hoc calibration on discrete marginals can reduce JSD without harming KS.
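One minimal way to realize this hypothesis is a marginal-matching pass: count generated tokens per category, then reassign only the surplus tokens of over-represented categories to under-represented ones. This is a sketch under assumptions (the function name and the floor-plus-remainder rounding are illustrative, not the project's implementation); because only surplus tokens move, the per-timestep structure that KS and lag-1 depend on is largely untouched.

```python
import numpy as np

def calibrate_discrete(gen, ref_probs, rng=None):
    """Post-hoc marginal calibration (sketch): move surplus tokens so the
    generated 1D marginal of one discrete column matches ref_probs."""
    rng = np.random.default_rng(rng)
    gen = gen.copy()
    n, k = len(gen), len(ref_probs)
    # Integer target counts: floor, then hand out the rounding remainder
    # to the categories with the largest fractional parts.
    target = np.floor(ref_probs * n).astype(int)
    frac = ref_probs * n - target
    target[np.argsort(-frac)[: n - target.sum()]] += 1
    counts = np.bincount(gen, minlength=k)
    # Collect random positions of surplus tokens per over-represented category.
    surplus = []
    for c in range(k):
        if counts[c] > target[c]:
            idx = np.flatnonzero(gen == c)
            surplus.extend(rng.choice(idx, counts[c] - target[c], replace=False))
    # Deficit categories, repeated by how many tokens each still needs;
    # total deficit equals total surplus, so the assignment lines up.
    deficits = np.repeat(np.arange(k), np.maximum(target - counts, 0))
    surplus = np.array(surplus, dtype=int)
    rng.shuffle(surplus)
    gen[surplus] = deficits
    return gen
```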


@@ -34,6 +34,10 @@ One command pipeline:
```
python example/run_all.py --device cuda
```
Full pipeline + diagnostics:
```
python example/run_all_full.py --device cuda
```
Pipeline stages:
1) **Prepare data** (`example/prepare_data.py`)
@@ -63,6 +67,11 @@ Defined in `example/hybrid_diffusion.py`.
- Transformer encoder (self-attention)
- Post LayerNorm + residual MLP
**Current default config (latest):**
- Backbone: Transformer
- Sequence length: 96
- Batch size: 16
**Outputs:**
- Continuous head: predicts target (`eps` or `x0`)
- Discrete heads: logits per discrete column
@@ -83,6 +92,8 @@ trend = GRU(x)
residual = x - trend
```
**Two-stage training:** temporal GRU first, diffusion on residuals.
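The decomposition above can be sketched end to end. This is a dependency-free illustration, not the pipeline's code: a causal moving average stands in for the stage-one GRU, and the diffusion model (trained on `residual` in stage two) is left abstract.

```python
import numpy as np

def trend_model(x, k=5):
    # Stand-in for the stage-1 GRU: a causal moving average over k steps,
    # left-padded with the first value so the output length matches x.
    pad = np.concatenate([np.full(k - 1, x[0]), x])
    return np.convolve(pad, np.ones(k) / k, mode="valid")

rng = np.random.default_rng(0)
x = np.sin(np.linspace(0, 6, 96)) + 0.1 * rng.normal(size=96)

trend = trend_model(x)   # stage 1: fit the trend model to low error first
residual = x - trend     # stage 2: the diffusion model is trained on this
# Sampling then recombines the two branches:
#   new_series = trend + diffusion_sample(residual_model)
```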
---
## 5. Diffusion Formulations / 扩散形式
@@ -145,6 +156,7 @@ Key steps:
- Streaming mean/std/min/max + int-like detection
- Optional **log1p transform** for heavy-tailed continuous columns
- Optional **quantile transform** (TabDDPM-style) for continuous columns (skips extra standardization)
- **Full quantile stats** (`full_stats`) for stable calibration
- Optional **post-hoc quantile calibration** to align 1D CDFs after sampling
- Discrete vocab + most frequent token
- Windowed batching with **shuffle buffer**
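The quantile transform step can be sketched as follows, assuming the usual TabDDPM-style recipe: map a value through the empirical CDF of the training column, then through the inverse normal CDF (`n_quantiles` and the linear interpolation are illustrative choices, not necessarily what `prepare_data.py` does). This also shows why `full_stats` matters: the quantile table `qs` is only as reliable as the data it is built from.

```python
import numpy as np
from statistics import NormalDist  # stdlib inverse normal CDF

def quantile_transform(train_col, new_col, n_quantiles=1000):
    """Map new_col to an approximately standard-normal scale via the
    empirical quantiles of train_col (TabDDPM-style, sketched)."""
    levels = np.linspace(0, 1, n_quantiles)
    qs = np.quantile(train_col, levels)       # quantile table from full stats
    u = np.interp(new_col, qs, levels)        # empirical CDF, interpolated
    u = np.clip(u, 1e-6, 1 - 1e-6)            # keep inv_cdf finite at the edges
    nd = NormalDist()
    return np.array([nd.inv_cdf(v) for v in u])
```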
@@ -189,8 +201,8 @@ Metrics (with reference):
- Writes `example/results/cdf_<feature>.svg` (real vs. generated CDF)
- Reports whether generated samples pile up at the boundaries (`gen_frac_at_min` / `gen_frac_at_max`)
Recent run (user-reported, Windows):
- avg_ks 0.7096 / avg_jsd 0.03318 / avg_lag1_diff 0.18984
Recent runs (Windows):
- 2026-01-27 21:22:34 — avg_ks 0.4046 / avg_jsd 0.0376 / avg_lag1_diff 0.1449
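Minimal sketches of the three reported metrics, under assumptions: `avg_ks` averages a two-sample KS statistic per continuous feature, `avg_jsd` averages Jensen-Shannon divergence over discrete marginals, and `avg_lag1_diff` compares lag-1 autocorrelations (the exact definitions live in the diagnostics script; these are plausible readings, not its code).

```python
import numpy as np

def ks_stat(real, gen):
    """Two-sample KS: max gap between the two empirical CDFs."""
    grid = np.sort(np.concatenate([real, gen]))
    cdf_r = np.searchsorted(np.sort(real), grid, side="right") / len(real)
    cdf_g = np.searchsorted(np.sort(gen), grid, side="right") / len(gen)
    return np.abs(cdf_r - cdf_g).max()

def jsd(p, q, eps=1e-12):
    """Jensen-Shannon divergence (in nats) between discrete marginals."""
    p, q = p + eps, q + eps
    p, q = p / p.sum(), q / q.sum()
    m = 0.5 * (p + q)
    kl = lambda a, b: np.sum(a * np.log(a / b))
    return 0.5 * kl(p, m) + 0.5 * kl(q, m)

def lag1_diff(real, gen):
    """Gap in lag-1 autocorrelation between real and generated series."""
    ac = lambda x: np.corrcoef(x[:-1], x[1:])[0, 1]
    return abs(ac(real) - ac(gen))
```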
---
@@ -226,9 +238,9 @@ Recent run (user-reported, Windows):
---
## 13. Known Issues / Current Limitations / 已知问题
- KS may remain high → continuous distribution mismatch
- KS can remain high on a subset of features → per-feature diagnosis required
- Lag1 may fluctuate → distribution vs temporal trade-off
- Continuous loss may dominate → needs careful weighting
- Discrete JSD can regress when continuous KS is prioritized
- Transformer backbone may change stability; needs systematic comparison
---
@@ -237,6 +249,7 @@ Recent run (user-reported, Windows):
- Compare GRU vs Transformer backbone using `run_compare.py`
- Explore **v-prediction** for continuous branch
- Strengthen discrete diffusion (e.g., D3PM-style transitions)
- Add targeted discrete calibration for high-JSD columns
---