Add filtered KS diagnostics and feature-type plan

2026-01-28 13:46:36 +08:00
parent 34d6f0d808
commit a1ff64aa40
6 changed files with 109 additions and 0 deletions
--- a/docs/README.md
+++ b/docs/README.md
@@ -14,6 +14,7 @@ Conventions:
 Tools:
 - `example/diagnose_ks.py` for per-feature KS + CDF plots.
 - `example/run_all_full.py` for one-command full pipeline + diagnostics.
+ - `example/filtered_metrics.py` for filtered KS after removing collapsed/outlier features.
 Notes:
 - If `use_quantile_transform` is enabled, run `prepare_data.py` with `full_stats: true` to build quantile tables.

--- a/docs/decisions.md
+++ b/docs/decisions.md
@@ -78,3 +78,10 @@
 - **Files**:
  - `example/prepare_data.py`
  - `example/config.json`
+
+## 2026-01-27 — Filtered KS for diagnostics
+- **Decision**: Add a filtered KS metric that excludes collapsed/outlier features.
+- **Why**: Keep a handful of features from dominating the aggregate KS; the full (unfiltered) KS is still reported alongside it.
+- **Files**:
+  - `example/filtered_metrics.py`
+  - `example/run_all_full.py`
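
Note on the decision above: the diff shown here does not include the body of `example/filtered_metrics.py`, so the following is only a minimal sketch of what a filtered KS metric could look like, not the committed implementation. The function names (`ks_statistic`, `filtered_ks`), the thresholds (`collapse_eps`, `ks_cutoff`), and the rules for "collapsed" (near-zero spread in the generated column) and "outlier" (per-feature KS above a cutoff) are all assumptions made for illustration.

```python
from bisect import bisect_right
from statistics import mean, pstdev


def ks_statistic(real, synth):
    """Two-sample KS statistic: largest gap between the two empirical CDFs."""
    a, b = sorted(real), sorted(synth)
    gap = 0.0
    # The empirical CDFs only change at observed values, so checking
    # every pooled sample point is enough to find the maximum gap.
    for x in a + b:
        fa = bisect_right(a, x) / len(a)
        fb = bisect_right(b, x) / len(b)
        gap = max(gap, abs(fa - fb))
    return gap


def filtered_ks(real_cols, synth_cols, collapse_eps=1e-6, ks_cutoff=0.8):
    """Return (full_mean_ks, filtered_mean_ks, dropped_feature_names).

    real_cols / synth_cols map feature name -> list of numeric values.
    Both thresholds are hypothetical defaults, not values from the repo.
    """
    per_feature = {
        name: ks_statistic(real_cols[name], synth_cols[name])
        for name in real_cols
    }
    dropped = [
        name
        for name, ks in per_feature.items()
        if pstdev(synth_cols[name]) < collapse_eps  # assumed "collapsed" rule
        or ks > ks_cutoff                           # assumed "outlier" rule
    ]
    kept = [ks for name, ks in per_feature.items() if name not in dropped]
    full = mean(list(per_feature.values()))
    # If every feature was filtered out, there is nothing to average.
    filtered = mean(kept) if kept else float("nan")
    return full, filtered, dropped
```

Returning the full mean next to the filtered one mirrors the stated intent of reporting both, and returning the dropped names makes it visible which features were excluded and why the two numbers differ.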
--- a/docs/ideas.md
+++ b/docs/ideas.md
@@ -14,3 +14,6 @@

 ## Discrete calibration
 - Hypothesis: post-hoc calibration on discrete marginals can reduce JSD without harming KS.
+
+## Feature-type split modeling
+- Hypothesis: separate generation per feature type (setpoints, controllers, actuators, quantized, derived, aux) yields better overall fidelity.