Fix quantile transform scaling and document
@@ -144,7 +144,7 @@ Defined in `example/data_utils.py` + `example/prepare_data.py`.
 Key steps:
 - Streaming mean/std/min/max + int-like detection
 - Optional **log1p transform** for heavy-tailed continuous columns
-- Optional **quantile transform** (TabDDPM-style) for continuous columns
+- Optional **quantile transform** (TabDDPM-style) for continuous columns (skips extra standardization)
 - Discrete vocab + most frequent token
 - Windowed batching with **shuffle buffer**
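Reviewer note: the changed bullet can be illustrated with a minimal, stdlib-only sketch. It is not the code in `example/data_utils.py`; the names `fit_quantiles`, `quantile_forward`, and `normalize` are hypothetical, and the sketch assumes the quantile transform targets a standard normal, which is why a second z-score step would be redundant.

```python
import math
from bisect import bisect_left
from statistics import NormalDist, fmean, pstdev

_STD_NORMAL = NormalDist()  # mean 0, std 1


def fit_quantiles(values, n_quantiles=100):
    """Return n_quantiles evenly spaced empirical quantiles of values."""
    ordered = sorted(values)
    last = len(ordered) - 1
    table = []
    for i in range(n_quantiles):
        pos = i / (n_quantiles - 1) * last  # fractional index into ordered
        lo = int(pos)
        hi = min(lo + 1, last)
        frac = pos - lo
        table.append(ordered[lo] * (1 - frac) + ordered[hi] * frac)
    return table


def quantile_forward(x, table, eps=1e-6):
    """Map x to a standard-normal score via its empirical CDF position."""
    n = len(table)
    if x <= table[0]:
        p = 0.0
    elif x >= table[-1]:
        p = 1.0
    else:
        hi = bisect_left(table, x)  # table[hi - 1] < x <= table[hi]
        lo = hi - 1
        frac = (x - table[lo]) / (table[hi] - table[lo])
        p = (lo + frac) / (n - 1)
    p = min(max(p, eps), 1 - eps)  # keep inv_cdf finite at the extremes
    return _STD_NORMAL.inv_cdf(p)


def normalize(x, mean, std, table=None):
    """Hypothetical dispatch: quantile path OR z-score path, never both."""
    if table is not None:
        return quantile_forward(x, table)  # already on a unit-normal scale
    return (x - mean) / std


# Heavy-tailed toy column (synthetic, for illustration only).
column = [math.exp(i / 50) for i in range(500)]
table = fit_quantiles(column)
scores = [normalize(v, fmean(column), pstdev(column), table) for v in column]
print("mean of scores:", fmean(scores))
print("std of scores:", pstdev(scores))
```

On the toy column the printed mean and standard deviation should land near 0 and 1, which is the reason the extra standardization is skipped when the quantile path is active.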
@@ -160,7 +160,7 @@ Export process:
 - Diffusion generates residuals
 - Output: `trend + residual`
 - De-normalize continuous values
-- Inverse quantile transform (if enabled)
+- Inverse quantile transform (if enabled; no extra de-standardization)
 - Bound to observed min/max (clamp or sigmoid mapping)
 - Restore discrete tokens from vocab
 - Write to CSV
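Reviewer note: a matching sketch for the export side, again stdlib-only and not the repository's implementation. `quantile_inverse`, `to_data_units`, and `bound` are hypothetical names. The sketch assumes the generated value is a standard-normal score when the quantile transform is enabled, and the `"sigmoid"` branch is only one possible reading of "sigmoid mapping".

```python
import math
from statistics import NormalDist

_STD_NORMAL = NormalDist()  # mean 0, std 1


def quantile_inverse(z, table):
    """Map a standard-normal score back to data units via a quantile table."""
    n = len(table)  # needs at least two entries
    p = _STD_NORMAL.cdf(z)  # position in [0, 1]
    pos = p * (n - 1)
    lo = min(int(pos), n - 2)
    frac = pos - lo
    return table[lo] * (1 - frac) + table[lo + 1] * frac


def to_data_units(z, mean, std, table=None):
    """Hypothetical dispatch mirroring the forward pass."""
    if table is not None:
        return quantile_inverse(z, table)  # no z * std + mean on this path
    return z * std + mean


def _sigmoid(x):
    # Numerically stable logistic function.
    if x >= 0:
        return 1.0 / (1.0 + math.exp(-x))
    e = math.exp(x)
    return e / (1.0 + e)


def bound(x, lo, hi, mode="clamp"):
    """Keep x inside the observed [lo, hi] range."""
    if mode == "clamp":
        return min(max(x, lo), hi)
    if mode == "sigmoid":
        # Assumption: squash an unbounded score smoothly into (lo, hi).
        return lo + (hi - lo) * _sigmoid(x)
    raise ValueError(f"unknown mode: {mode!r}")


# Toy quantile table (synthetic): evenly spaced values from 0 to 4.
table = [0.0, 1.0, 2.0, 3.0, 4.0]
value = to_data_units(0.0, mean=10.0, std=3.0, table=table)
print(bound(value, table[0], table[-1]))
```

Because the inverse quantile table is built from observed values, its output already falls between the table's first and last entries, so the clamp mainly matters on the plain de-normalization path.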