Add filtered KS diagnostics and feature-type plan

2026-01-28 13:46:36 +08:00
parent 34d6f0d808
commit a1ff64aa40
6 changed files with 109 additions and 0 deletions

@@ -14,6 +14,7 @@ Conventions:
Tools:
- `example/diagnose_ks.py` for per-feature KS + CDF plots.
- `example/run_all_full.py` for one-command full pipeline + diagnostics.
- `example/filtered_metrics.py` for filtered KS after removing collapsed/outlier features.
Notes:
- If `use_quantile_transform` is enabled, run `prepare_data.py` with `full_stats: true` to build quantile tables.

@@ -78,3 +78,10 @@
- **Files**:
- `example/prepare_data.py`
- `example/config.json`
## 2026-01-27 — Filtered KS for diagnostics
- **Decision**: Add a filtered KS metric that excludes collapsed/outlier features.
- **Why**: Avoid a handful of features dominating the aggregate KS while still reporting full KS.
- **Files**:
- `example/filtered_metrics.py`
- `example/run_all_full.py`
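
A minimal sketch of what a filtered KS metric could look like, reporting the full aggregate next to it. This is not the contents of `example/filtered_metrics.py`: the function names, the `std_eps` test for a "collapsed" feature (near-zero variance in the generated data), and the `ks_cutoff` test for an "outlier" feature are all assumptions made for illustration.

```python
import numpy as np

def ks_statistic(real, synth):
    """Two-sample KS statistic: largest gap between the two empirical CDFs."""
    real = np.sort(np.asarray(real, dtype=float))
    synth = np.sort(np.asarray(synth, dtype=float))
    # Evaluate both empirical CDFs at every observed point.
    grid = np.concatenate([real, synth])
    cdf_real = np.searchsorted(real, grid, side="right") / real.size
    cdf_synth = np.searchsorted(synth, grid, side="right") / synth.size
    return float(np.max(np.abs(cdf_real - cdf_synth)))

def filtered_ks(real, synth, names, std_eps=1e-8, ks_cutoff=0.9):
    """Mean per-feature KS, with and without collapsed/outlier features.

    Assumed (hypothetical) filter rules:
      - collapsed: generated column has standard deviation below std_eps
      - outlier:   per-feature KS is above ks_cutoff
    """
    per_feature = {n: ks_statistic(real[:, i], synth[:, i])
                   for i, n in enumerate(names)}
    collapsed = {n for i, n in enumerate(names) if np.std(synth[:, i]) < std_eps}
    outliers = {n for n, v in per_feature.items() if v > ks_cutoff}
    dropped = collapsed | outliers
    kept = [v for n, v in per_feature.items() if n not in dropped]
    return {
        "full_ks": float(np.mean(list(per_feature.values()))),
        "filtered_ks": float(np.mean(kept)) if kept else float("nan"),
        "dropped": sorted(dropped),
    }
```

Returning both `full_ks` and `filtered_ks`, plus the list of dropped features, keeps the filtering visible so the filtered number cannot silently hide a failing feature.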

@@ -14,3 +14,6 @@
## Discrete calibration
- Hypothesis: post-hoc calibration on discrete marginals can reduce JSD without harming KS.
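
For reference, the JSD in this hypothesis can be measured per discrete feature by comparing the real and generated category frequencies. The sketch below assumes base-2 logarithms (so the value lies in [0, 1]) and uses hypothetical helper names; it shows only the metric, not a calibration method.

```python
import numpy as np

def discrete_marginal(values, categories):
    """Empirical probability of each category, in the order given."""
    values = np.asarray(values)
    counts = np.array([np.sum(values == c) for c in categories], dtype=float)
    return counts / counts.sum()

def jsd(p, q):
    """Jensen-Shannon divergence between two discrete distributions (base 2)."""
    p = np.asarray(p, dtype=float)
    q = np.asarray(q, dtype=float)
    m = 0.5 * (p + q)

    def kl(a, b):
        # Terms with a == 0 contribute nothing; b > 0 wherever a > 0 because b is the mixture.
        mask = a > 0
        return float(np.sum(a[mask] * np.log2(a[mask] / b[mask])))

    return 0.5 * kl(p, m) + 0.5 * kl(q, m)
```

Tracking this value before and after calibration, alongside the per-feature KS, is one way to check the "without harming KS" part of the hypothesis.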
## Feature-type split modeling
- Hypothesis: separate generation per feature type (setpoints, controllers, actuators, quantized, derived, aux) yields better overall fidelity.