Add filtered KS diagnostics and feature-type plan

2026-01-28 13:46:36 +08:00
parent 34d6f0d808
commit a1ff64aa40
6 changed files with 109 additions and 0 deletions


@@ -94,6 +94,33 @@ residual = x - trend
**Two-stage training:** temporal GRU first, diffusion on residuals.
### 4.3 Feature-Type Aware Strategy / 特征类型分治方案
Based on HAI feature semantics and observed KS outliers, we classify problematic features into six types and plan separate modeling paths:
1) **Type 1: Exogenous setpoints / demands** (schedule-driven, piecewise-constant)
Examples: P1_B4002, P2_MSD, P4_HT_LD
Strategy: program generator (HSMM / change-point), or sample from program library; condition diffusion on these.
2) **Type 2: Controller outputs** (policy-like, saturation / rate limits)
Example: P1_B4005
Strategy: small controller emulator (PID/NARX) with clamp + rate-limit.
3) **Type 3: Spiky actuators** (few operating points + long dwell)
Examples: P1_PCV02Z, P1_FCV02Z
Strategy: spike-and-slab + dwell-time modeling, or command-driven actuator dynamics.
4) **Type 4: Quantized / digital-as-continuous**
Examples: P4_ST_PT01, P4_ST_TT01
Strategy: generate a latent continuous value and then quantize it, or treat the feature as ordinal and use discrete diffusion.
5) **Type 5: Derived conversions**
Examples: `*FT*` → `*FTZ*`
Strategy: generate base variable and derive conversions deterministically.
6) **Type 6: Aux / vibration / narrow-band**
Examples: P2_24Vdc, P2_HILout
Strategy: AR/ARMA or regime-conditioned narrow-band models.
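
Two of the post-processing steps named above (clamp + rate-limit for Type 2, quantization for Type 4) can be sketched in a few lines. This is a minimal illustration, not the repository's implementation: the function names `clamp_rate_limit` and `quantize`, and the parameters `lo`, `hi`, `max_step`, and `step`, are hypothetical and would need to be chosen per feature.

```python
def clamp_rate_limit(u, lo, hi, max_step):
    """Clamp a signal to [lo, hi] and limit its change per step to max_step.

    Hypothetical helper for Type 2 (controller outputs); the limits are assumptions.
    """
    out = []
    prev = None
    for v in u:
        v = min(max(v, lo), hi)  # saturation
        if prev is not None:
            delta = min(max(v - prev, -max_step), max_step)  # rate limit
            v = prev + delta
        out.append(v)
        prev = v
    return out


def quantize(x, step, lo, hi):
    """Snap a latent continuous signal to a grid of width `step`, then clamp.

    Hypothetical helper for Type 4 (quantized / digital-as-continuous).
    """
    return [min(max(round(v / step) * step, lo), hi) for v in x]


if __name__ == "__main__":
    print(clamp_rate_limit([0, 10, 10, -5], lo=0, hi=5, max_step=2))
    print(quantize([0.12, 0.26, 0.9], step=0.25, lo=0.0, hi=0.75))
```

Applying these as deterministic post-processing keeps the hard constraints (saturation, slew rate, quantization grid) out of the generative model, which then only has to produce an unconstrained latent signal.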
---
## 5. Diffusion Formulations / 扩散形式
@@ -201,6 +228,11 @@ Metrics (with reference):
- Outputs `example/results/cdf_<feature>.svg` (real vs. generated CDF)
- Reports whether generated data piles up at the boundaries (`gen_frac_at_min` / `gen_frac_at_max`)
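
As a rough illustration of the boundary pile-up check, the sketch below computes the two fractions for a single feature. It assumes the boundaries are the minimum and maximum of the real data and that a small tolerance `tol` decides what counts as "at" the boundary; both are assumptions, and `boundary_fractions` is a hypothetical name, not a function from the repository.

```python
def boundary_fractions(real, gen, tol=1e-9):
    """Fraction of generated values sitting at the real data's min / max.

    Assumption: boundaries are taken from the real series; `tol` is illustrative.
    """
    lo, hi = min(real), max(real)
    n = len(gen)
    at_min = sum(1 for v in gen if v <= lo + tol) / n
    at_max = sum(1 for v in gen if v >= hi - tol) / n
    return {"gen_frac_at_min": at_min, "gen_frac_at_max": at_max}


if __name__ == "__main__":
    print(boundary_fractions(real=[0, 1, 2, 3, 4], gen=[0, 0, 2, 4]))
```

A large value on either side suggests the generator is saturating at a clipping boundary rather than reproducing the interior of the distribution.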
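**Filtered KS (excludes hard-to-learn features; diagnostics only):** `example/filtered_metrics.py`
- Rule: features whose std is too small or whose KS is too high are excluded automatically
- Outputs `example/results/filtered_metrics.json`
- For diagnostics only; not used as a final metric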
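
The filtering rule can be sketched as follows. This is not the contents of `example/filtered_metrics.py`; the thresholds `min_std` and `max_ks`, the function names, and the output keys are all assumptions chosen for illustration. The KS statistic is the standard two-sample form, the largest gap between the two empirical CDFs.

```python
import statistics
from bisect import bisect_right


def ks_statistic(a, b):
    """Two-sample KS statistic: max |ECDF_a(v) - ECDF_b(v)| over all observed v."""
    a_sorted, b_sorted = sorted(a), sorted(b)
    d = 0.0
    for v in a_sorted + b_sorted:
        fa = bisect_right(a_sorted, v) / len(a_sorted)
        fb = bisect_right(b_sorted, v) / len(b_sorted)
        d = max(d, abs(fa - fb))
    return d


def filtered_avg_ks(real, gen, min_std=1e-3, max_ks=0.8):
    """Average KS over features that pass the filter.

    `real` and `gen` map feature name -> list of values.
    `min_std` and `max_ks` are hypothetical thresholds.
    """
    kept, dropped = {}, {}
    for name in real:
        std = statistics.pstdev(real[name])
        ks = ks_statistic(real[name], gen[name])
        if std < min_std:
            dropped[name] = "low_std"
        elif ks > max_ks:
            dropped[name] = "high_ks"
        else:
            kept[name] = ks
    avg = sum(kept.values()) / len(kept) if kept else None
    return {"avg_ks_filtered": avg, "kept": kept, "dropped": dropped}


if __name__ == "__main__":
    real = {"a": [0, 1, 2, 3], "b": [5, 5, 5, 5], "c": [0, 1, 2, 3]}
    gen = {"a": [0, 1, 2, 3], "b": [5, 5, 5, 5], "c": [10, 11, 12, 13]}
    print(filtered_avg_ks(real, gen))
```

Recording the reason each feature was dropped keeps the filtered average traceable, which matters because the filtered number is a diagnostic and should always be read next to the unfiltered one.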
Recent runs (Windows):
- 2026-01-27 21:22:34 — avg_ks 0.4046 / avg_jsd 0.0376 / avg_lag1_diff 0.1449