2026-01-22 21:17:11 +08:00
parent 5a109f91ac
commit 178fb7441c
4 changed files with 102 additions and 12 deletions


@@ -66,6 +66,8 @@ python example/run_pipeline.py --device auto
- Continuous sampling is clipped in normalized space each step for stability.
- Optional conditioning by file id (`train*.csv.gz`) is enabled by default for multi-file training.
- Continuous head can be bounded with `tanh` via `use_tanh_eps` in config.
- Export now clamps continuous features to training min/max and preserves integer/decimal precision.
- `<UNK>` tokens are replaced by the most frequent token for each discrete column at export.
- To keep runs fast, the script samples only the first 5000 rows.
- `prepare_data.py` runs without PyTorch, but `train.py` and `sample.py` require it.
- `train.py` and `sample.py` auto-select GPU if available; otherwise they fall back to CPU.
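The per-step clipping of continuous samples in normalized space can be sketched as follows. This is a minimal illustration, not the pipeline's actual code; the bounds `[-1, 1]` are an assumption about the normalization range:

```python
import numpy as np

def clip_normalized(x, lo=-1.0, hi=1.0):
    # Hypothetical sketch: clamp a batch of continuous samples back into
    # the normalized range after each sampling step, so numerical drift
    # cannot push values outside the space the model was trained on.
    # The real pipeline's bounds may differ.
    return np.clip(x, lo, hi)

print(clip_normalized(np.array([-2.0, 0.3, 4.0])))
```

Applying this every step (rather than once at the end) keeps intermediate states well-conditioned, which is the stability benefit the note above describes.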
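One plausible reading of the `use_tanh_eps` option is a `tanh` squashed head scaled slightly beyond the unit range, so the output can actually reach the normalized bounds ±1. The name `tanh_bounded`, the default `eps`, and the exact formula here are assumptions for illustration, not the config's documented semantics:

```python
import math

def tanh_bounded(raw, eps=0.05):
    # Hypothetical interpretation of `use_tanh_eps`: squash the raw head
    # output into (-(1 + eps), 1 + eps). Plain tanh only approaches ±1
    # asymptotically; the eps margin lets the head hit the boundary values.
    return (1.0 + eps) * math.tanh(raw)

print(tanh_bounded(0.0), tanh_bounded(50.0))
```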
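The export-time treatment of continuous features could look like the sketch below: clamp to the training min/max, then round to the column's original precision. The helper name and the `decimals` parameter are hypothetical; the pipeline may track precision differently:

```python
import numpy as np

def clamp_and_round(values, train_min, train_max, decimals):
    # Hypothetical export step: generated values never leave the range
    # observed in training, and rounding restores the column's original
    # precision (decimals=0 keeps integer columns integral).
    clipped = np.clip(values, train_min, train_max)
    return np.round(clipped, decimals)

print(clamp_and_round(np.array([0.123, 9.9]), 0.5, 5.0, 2))
```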
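The `<UNK>` replacement rule at export can be sketched per column as below. The function name is hypothetical; the mode is taken over non-`<UNK>` tokens, which is an assumption about how ties with the placeholder itself are avoided:

```python
from collections import Counter

def replace_unk(column, unk="<UNK>"):
    # Hypothetical sketch of the export rule: swap every <UNK> token for
    # the most frequent real token in this discrete column. If the column
    # contains only <UNK>, there is no candidate and it is left unchanged.
    counts = Counter(t for t in column if t != unk)
    if not counts:
        return list(column)
    mode = counts.most_common(1)[0][0]
    return [mode if t == unk else t for t in column]

print(replace_unk(["a", "a", "b", "<UNK>"]))
```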