[Suggestion]: Handle KS troublemaker features separately #2


manbo · 2026-01-28T13:29:31+08:00

manbo commented

2026-01-28 13:29:31 +08:00

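With the current architecture, we can use eval.json to filter out the features that are hard to learn. The main cause is that the semantics of these features conflict with the diffusion modeling paradigm. DDPM seems unable to learn event-driven / regime-driven behavior (these are discrete, though, and I am puzzled about how the current architecture handles discrete variables: is it only masking?)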

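I just had GPT classify the features, using the annotations in the HAI PDF as a reference. It came up with six types, and I suggest planning a separate solution for each one. GPT's original answer: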

I pulled the post-Jan-27 eval snapshot you’re referring to (the one where you logged KS ≈ 0.4046).
In that eval, a small set of features are still extreme outliers in continuous_ks (e.g., P1_B4002 = 1.0, P2_MSD = 1.0, P1_PCV02Z ≈ 0.992, P1_B4005 ≈ 0.797, P4_HT_LD ≈ 0.717).
Two of them are outright collapsed to constants in the generated data (std = 0.0): P1_B4002 and P2_MSD.
Those alone can keep KS stubbornly high.
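To make the collapse check concrete, here is a minimal stdlib-only sketch of a two-sample KS statistic plus a std = 0 test. The function names `ks_statistic` and `is_collapsed` are hypothetical and are not taken from the project's evaluation code, which may compute KS differently.

```python
from bisect import bisect_right
from statistics import pstdev

def ks_statistic(real, generated):
    """Two-sample KS statistic: the largest gap between the two empirical CDFs."""
    real_sorted, gen_sorted = sorted(real), sorted(generated)
    gap = 0.0
    # Both empirical CDFs are step functions, so the largest gap
    # is always reached at one of the observed values.
    for x in set(real_sorted) | set(gen_sorted):
        cdf_real = bisect_right(real_sorted, x) / len(real_sorted)
        cdf_gen = bisect_right(gen_sorted, x) / len(gen_sorted)
        gap = max(gap, abs(cdf_real - cdf_gen))
    return gap

def is_collapsed(generated, tol=1e-12):
    """True when the generated channel is (numerically) a constant."""
    return pstdev(generated) <= tol
```

A generated channel that is stuck at a constant outside the real value range yields a KS of exactly 1.0 under this definition, which is consistent with the two std = 0 features above.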

Below is a practical classification of the “troublemakers” (and a small watchlist), plus what to model them with instead.

Type 1 — Exogenous setpoints & demands (schedule-driven, piecewise-constant)

Features (high KS / watchlist)

P1_B4002 (KS=1.0) — High water level setpoint
P2_MSD (KS=1.0) — corresponds to Manual speed demand in the dataset docs
P4_HT_LD (KS≈0.717) — Load demand of HTM
P1_B2004 (KS≈0.695)
P1_B3004 (KS≈0.662)
P1_B3005 (KS≈0.564)
P1_B4022 (KS≈0.619)

Why they break diffusion KS
These are not “plant responses”; they’re operator/program inputs. In HAI, setpoint-change commands are delivered at the start of each day (i.e., regime/schedule-like). Diffusion often either (a) blurs steps into ramps, or (b) collapses to a mean (your std=0 cases).

Better methods/architectures

1. HSMM / change-point program generator (recommended)
   - Model them as piecewise-constant segments: sample {num_changes, change_times, step_sizes} + optional ramps.
   - HSMM/explicit duration modeling beats Transformer/GRU here because it matches “dwell time” statistics directly.
2. Conditional “control-program first” hierarchy
   - Generate (setpoints, demands, mode flags) first → then condition your diffusion on them to generate the process measurements.
   - This keeps the joint distribution sane while letting you use the right tool per variable.
3. Empirical program library (fast baseline that often wins KS)
   - Build a library of real daily SP/demand programs; during generation, sample one (or splice segments) and add small noise where appropriate.
   - Because these are exogenous, this is often acceptable and very hard to beat on KS.
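As a rough illustration of the piecewise-constant idea, the sketch below run-length encodes a real setpoint trace into (level, dwell) segments and resamples them. This is the simplest segment-splicing baseline, not a full HSMM (it ignores transitions between levels), and `fit_segments` / `sample_program` are hypothetical names.

```python
import random

def fit_segments(series):
    """Run-length encode a piecewise-constant series into (level, dwell) pairs."""
    segments = []
    level, dwell = series[0], 1
    for value in series[1:]:
        if value == level:
            dwell += 1
        else:
            segments.append((level, dwell))
            level, dwell = value, 1
    segments.append((level, dwell))
    return segments

def sample_program(segments, length, rng):
    """Build a new setpoint trace by splicing randomly chosen observed segments."""
    out = []
    while len(out) < length:
        level, dwell = rng.choice(segments)
        out.extend([level] * dwell)
    return out[:length]

# Example: a toy setpoint trace with two levels.
segments = fit_segments([10, 10, 10, 20, 20, 10])
program = sample_program(segments, 50, random.Random(0))
```

Because levels and dwell times are copied from real data, the generated channel can only take values that actually occur, which is what keeps the marginal close to the real one.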

Type 2 — Controller outputs (policy-like, saturation/rate-limits)

Features

P1_B4005 (KS≈0.797) — Temperature PID control output

Why it’s hard
Controller outputs are a function of (setpoint − measurement) + integral/derivative terms + saturation. If you model it as an unconstrained continuous variable, you’ll miss (a) the heavy mass at 0/100, and (b) the sharp transitions.

Better methods/architectures

1. Learned PID emulator (very strong for KS)
   - Train a small module that predicts u_t from (SP_t, PV_t, PV_{t-1}, u_{t-1}), with explicit clamp + rate-limit in the forward pass.
   - Optionally learn (Kp, Ki, Kd) (global or regime-conditioned).
2. State-space / NARX model
   - A small ARX/NARX (or Kalman-style) model often matches controller-output marginals better than diffusion because the dynamics are low-dimensional and constrained.
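The forward pass of such an emulator could look roughly like the sketch below. It uses fixed gains rather than learned ones, and the 0..100 output range, the rate limit of 5 per step, and the name `pid_step` are assumptions for illustration only.

```python
def clamp(x, lo, hi):
    """Limit x to the closed interval [lo, hi]."""
    return max(lo, min(hi, x))

def pid_step(sp, pv, pv_prev, u_prev, integral, kp, ki, kd,
             dt=1.0, u_min=0.0, u_max=100.0, max_rate=5.0):
    """One PID update with explicit rate limit and output saturation.

    Returns the new controller output and the updated integral term.
    No anti-windup is applied; a real controller may well use one.
    """
    error = sp - pv
    integral_new = integral + error * dt
    derivative = -(pv - pv_prev) / dt              # derivative on the measurement
    u_raw = kp * error + ki * integral_new + kd * derivative
    u_limited = u_prev + clamp(u_raw - u_prev, -max_rate, max_rate)  # rate limit
    return clamp(u_limited, u_min, u_max), integral_new             # saturation

# Example: a large error only moves the output by max_rate per step.
u, integral = pid_step(sp=50, pv=40, pv_prev=40, u_prev=0.0, integral=0.0,
                       kp=10.0, ki=0.0, kd=0.0)
```

In a learned version, (Kp, Ki, Kd) and the limits would be trainable parameters, while the clamp and rate limit stay hard-coded so the output keeps its mass at the saturation bounds.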

Type 3 — Actuator positions with “spiky”/rarely-changing distributions

Features (high KS / watchlist)

P1_PCV02Z (KS≈0.992) — current position of PCV02
P1_FCV02Z (KS≈0.706)
P1_PCV01D (KS≈0.557), P1_PCV01Z (KS≈0.559)

Why they break KS
Many valve positions are effectively mixtures of a few operating points + long dwell times. KS gets dominated by “missing spikes” even if your mean/std look fine.

Better methods/architectures

1. Mixture-of-deltas + small noise (spike-and-slab)
   - Fit a small set of discrete operating points (k-means on values) and model:
     Z_t = discrete_state_t + ε, where discrete_state_t is categorical with dwell-time modeling.
2. Actuator dynamics model driven by command
   - If you have a command channel (D), model Z as a constrained lag system:
     Z_{t+1} = Z_t + clip(α*(D_t − Z_t), -rate, +rate) + noise
   - Learn α, rate per actuator/regime.
3. Regime-conditioned modeling
   - Positions often behave differently under different operational regimes; conditioning on your generated “program state” from Type 1 helps a lot.
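The constrained lag system above can be simulated directly. The sketch below is a plain rollout of that update rule with fixed `alpha` and `rate`; fitting them per actuator or regime is left out, and `simulate_actuator` is a hypothetical name.

```python
import random

def simulate_actuator(commands, z0, alpha, rate, noise_std=0.0, rng=None):
    """Roll out Z_{t+1} = Z_t + clip(alpha * (D_t - Z_t), -rate, +rate) + noise."""
    rng = rng or random.Random()
    z = z0
    trace = [z]
    for d in commands:
        step = max(-rate, min(rate, alpha * (d - z)))  # rate-limited move toward command
        z = z + step + rng.gauss(0.0, noise_std)
        trace.append(z)
    return trace

# Example: a step command of 100 is followed at no more than 10 units per tick.
trace = simulate_actuator([100] * 3, z0=0, alpha=0.5, rate=10)
```

With zero noise the position ramps at the rate limit until it gets close to the command, then converges geometrically, which gives the long flat dwell periods described above.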

Type 4 — Quantized / “digital-as-continuous” variables

These don’t look like the top KS offenders right now, but they’re common future pain points and often worth splitting out early.

Examples present in your eval

Steam-turbine digital values like P4_ST_PT01, P4_ST_TT01, and P4_ST_GOV (dataset lists them as digital value / raw ranges).

Better methods

Generate latent continuous → quantize (round/clamp) to the exact integer range.
Or treat as ordinal discrete diffusion (mask diffusion over bins) instead of Gaussian DDPM.
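A minimal sketch of the first option, assuming a hypothetical integer range of 0..100 (the real range should be read from the dataset documentation for each tag):

```python
def quantize(values, lo, hi):
    """Round each latent continuous value and clamp it to the integer range [lo, hi]."""
    # Note: Python's round() rounds exact halves to the nearest even integer.
    return [int(max(lo, min(hi, round(v)))) for v in values]

# Example: out-of-range and fractional values are mapped onto valid integer codes.
codes = quantize([-3.2, 0.4, 7.6, 1e6], lo=0, hi=100)
```

Applying this as a post-processing step guarantees the generated channel only takes values on the same integer grid as the real data.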

Type 5 — Derived conversions (don’t model twice)

You have explicit “Z” conversions that are deterministic transforms of base signals.

Examples

P1_FT03Z is a conversion from P1_FT03
P1_FT01Z, P1_FT02Z are also “converted from” their base FT signals.

Best practice

Only generate the base physical variable (FT01/FT02/FT03) and compute the converted Z features deterministically.
This usually improves both KS and cross-feature consistency “for free”.
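One way to do this without knowing the exact conversion formula is to fit it on real data and then apply it to the generated base signal. The sketch below assumes the conversion is affine (y = slope * x + intercept), which is an assumption and should be checked against the dataset documentation; `fit_affine` and `recompute` are hypothetical names.

```python
def fit_affine(base, converted):
    """Least-squares fit of converted = slope * base + intercept on real data."""
    n = len(base)
    mean_x = sum(base) / n
    mean_y = sum(converted) / n
    sxx = sum((x - mean_x) ** 2 for x in base)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(base, converted))
    slope = sxy / sxx
    return slope, mean_y - slope * mean_x

def recompute(base_generated, slope, intercept):
    """Derive the converted channel from the generated base channel."""
    return [slope * x + intercept for x in base_generated]

# Example with toy values standing in for a real (base, converted) pair.
slope, intercept = fit_affine([0, 1, 2, 3], [1, 3, 5, 7])
derived = recompute([10], slope, intercept)
```

If the residuals of the fit on real data are not close to zero, the conversion is not affine and a different deterministic mapping is needed.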

Type 6 — Auxiliary / vibration / HIL / power-supply signals

These are usually narrow-band / near-stationary and often easier with classic time-series models.

Examples

Vibration-related: P2_VXT02, P2_VXT03, P2_VYT02, P2_VYT03
P2_24Vdc, P2_HILout, etc.

Better methods

AR(1)/ARMA (or small state-space) + regime conditioning.
If you care about tails: Student-t innovations or simple GARCH (vibration).
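A minimal AR(1) sketch with optional Student-t innovations, using only the standard library. The names are hypothetical, regime conditioning is omitted, and the residual scale is reused as-is for the t innovations, so the simulated variance is somewhat larger than the fitted one when `df` is small.

```python
import math
import random

def fit_ar1(series):
    """Estimate mean, AR(1) coefficient and residual scale from a series."""
    mean = sum(series) / len(series)
    centered = [x - mean for x in series]
    num = sum(b * a for a, b in zip(centered[:-1], centered[1:]))
    den = sum(a * a for a in centered[:-1])
    phi = num / den
    resid = [b - phi * a for a, b in zip(centered[:-1], centered[1:])]
    scale = math.sqrt(sum(r * r for r in resid) / len(resid))
    return mean, phi, scale

def student_t(df, rng):
    """Draw a Student-t variate as normal / sqrt(chi-square / df)."""
    chi2 = rng.gammavariate(df / 2.0, 2.0)
    return rng.gauss(0.0, 1.0) / math.sqrt(chi2 / df)

def simulate_ar1(mean, phi, scale, length, rng, df=None):
    """Simulate x_t = phi * x_{t-1} + scale * eps_t around the given mean."""
    x, out = 0.0, []
    for _ in range(length):
        eps = student_t(df, rng) if df else rng.gauss(0.0, 1.0)
        x = phi * x + scale * eps
        out.append(mean + x)
    return out
```

Typical use would be `mean, phi, scale = fit_ar1(real_channel)` followed by `simulate_ar1(mean, phi, scale, n, random.Random(0), df=4)` for a heavier-tailed channel.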

Evaluate your “filter them out + model separately” idea

It’s a good move as long as you keep the dependency structure:

✅ Good: You stop a few “non-diffusion-friendly” channels (setpoints, spike valves, quantized registers) from dominating the aggregate KS and you can use the right model class per variable.
⚠️ Risk: If you model them independently, you may break joint realism (e.g., valve positions not matching setpoints, controller output not matching error).

The robust way to do it is a hierarchical generator:

1. Generate program/regime variables (Type 1) + mode flags
2. Generate controller/actuator variables conditioned on (1)
3. Generate process measurements conditioned on (1) and (2) (your diffusion stays here)
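The ordering of the three stages can be sketched as a small pipeline. Every stage below is a toy stub with made-up dynamics and constants; in particular `generate_measurements` only stands in for the conditional diffusion sampler, and none of these function names come from the project.

```python
import random

def generate_program(length, rng):
    """Stage 1 stub: a piecewise-constant setpoint with random dwell times."""
    out = []
    while len(out) < length:
        out.extend([rng.choice([40.0, 60.0])] * rng.randint(5, 20))
    return out[:length]

def generate_controls(setpoints):
    """Stage 2 stub: controller output conditioned on the program, clamped to 0..100."""
    return [max(0.0, min(100.0, 1.5 * sp - 25.0)) for sp in setpoints]

def generate_measurements(setpoints, controls, rng):
    """Stage 3 stub: process value conditioned on stages 1 and 2."""
    pv, out = setpoints[0], []
    for sp, u in zip(setpoints, controls):
        pv += 0.2 * (sp - pv) + 0.01 * (u - 50.0) + rng.gauss(0.0, 0.05)
        out.append(pv)
    return out

def generate_window(length, seed=0):
    """Run the stages in order so each one sees the outputs of the earlier ones."""
    rng = random.Random(seed)
    setpoints = generate_program(length, rng)
    controls = generate_controls(setpoints)
    measurements = generate_measurements(setpoints, controls, rng)
    return {"setpoint": setpoints, "control": controls, "measurement": measurements}
```

The point of the structure is that each later stage receives the earlier outputs as conditioning, so swapping a stub for a real model does not change the interface.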

Two immediate “highest ROI” fixes from your eval

1. Fix the constant-output collapse for P1_B4002 and P2_MSD (both have std=0). If you split Type 1 out into a program generator, these usually stop being a diffusion problem entirely.
2. Handle spiky actuator positions with a spike-and-slab / dwell-time model (P1_PCV02Z ~ 0.992 KS).

If you want, paste the list of features you were planning to “exclude from overall KS”, and I’ll propose a concrete module boundary (what to keep in diffusion vs. what to move out) + an evaluation plan that still checks joint realism.


manbo commented

2026-01-28 18:30:00 +08:00

I pulled the six types out into a separate table; let's go with this classification for now.

Below are the troublemaker continuous features (KS ≥ 0.50) from the current eval.json, grouped into the 6 types we discussed. (KS values shown are exactly what’s recorded in eval.json.)

Type 1 — Operator setpoints / demands (schedule-driven, piecewise-constant)

P1_B4002 — KS 1.0000
P2_MSD — KS 1.0000
P4_HT_LD — KS 0.7168
P1_B2004 — KS 0.6954
P1_B3004 — KS 0.6621
P1_B4022 — KS 0.6190
P1_B3005 — KS 0.5645

Suggestion: model these separately as an HMM / state-machine / scheduler, or make them exogenous conditioning (because the dataset’s normal operation explicitly uses an HMM to generate setpoint commands).

Type 2 — Controller internal/output variables (mid-loop, non-Gaussian)

P1_B4005 — KS 0.7971

Suggestion: learn as conditional model (condition on SP + PV + mode), or use a mixture density / flow head rather than a single Gaussian-style target.

Type 3 — Actuator command/position variables (saturation + spikes)

P1_PCV02Z — KS 0.9922
P1_PCV01Z — KS 0.5591
P1_PCV01D — KS 0.5571
P1_FCV02Z — KS 0.7062

Suggestion: model as state-conditional (e.g., separate “steady”, “ramping”, “saturated” regimes), or use discrete mode + continuous residual (a small regime classifier + conditional continuous generator).
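A rough sketch of the "discrete mode + continuous residual" idea: pick a few operating points, then sample a mode with a dwell time and add small noise. It uses the most frequent rounded values instead of a trained regime classifier or k-means, and exponential dwell times; both are simplifying assumptions, and the names are hypothetical.

```python
import random
from collections import Counter

def fit_operating_points(series, k=3, decimals=0):
    """Take the k most frequent rounded values as operating points with weights."""
    counts = Counter(round(v, decimals) for v in series)
    top = counts.most_common(k)
    total = sum(c for _, c in top)
    return [(point, c / total) for point, c in top]

def sample_positions(points, length, rng, mean_dwell=50, noise_std=0.05):
    """Sample a mode, hold it for a random dwell time, and add a small residual."""
    levels = [p for p, _ in points]
    weights = [w for _, w in points]
    out = []
    while len(out) < length:
        level = rng.choices(levels, weights=weights)[0]
        dwell = 1 + int(rng.expovariate(1.0 / mean_dwell))
        out.extend(level + rng.gauss(0.0, noise_std) for _ in range(dwell))
    return out[:length]

# Example: two dominant valve positions recovered from a toy trace.
points = fit_operating_points([0.0] * 6 + [100.0] * 3 + [50.2], k=2)
positions = sample_positions(points, 200, random.Random(0))
```

The spikes in the marginal come from the discrete levels, so KS no longer depends on the continuous generator reproducing them.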

Type 4 — Process-variable sensors (PV) with mode changes / heavy tails

P1_PIT02 — KS 0.5847
P2_SIT02 — KS 0.5010
P1_FT03 — KS 0.5241

Suggestion: try conditional normalizing flows or SSM-style (latent state) models; these PVs often become multi-modal because the plant operates in distinct regimes.

Type 5 — Derived / converted tags (should be near-deterministic)

P1_FT03Z — KS 0.5243

Suggestion: don’t “generate” it—recompute deterministically from its source measurement(s) in post-processing (or train it as a deterministic head), so it can’t drift distributionally.

Type 6 — HIL / cross-process coupling outputs

P4_HT_PO — KS 0.5586

Suggestion: model as a conditional/physics-informed head (condition on demand + key process PVs). These variables are driven by the simulator coupling, so unconditional diffusion often misses the correct mapping.
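For reference, the KS ≥ 0.50 filter used for this table can be sketched as follows. It assumes eval.json holds a `continuous_ks` object mapping feature name to KS value; that layout is a guess based on the field name mentioned earlier, and `troublemakers` is a hypothetical helper.

```python
import json

def troublemakers(eval_data, threshold=0.5, key="continuous_ks"):
    """Return (feature, KS) pairs at or above the threshold, worst first."""
    scores = eval_data[key]
    picked = [(name, ks) for name, ks in scores.items() if ks >= threshold]
    return sorted(picked, key=lambda item: item[1], reverse=True)

# Example with an inline snippet instead of the real file;
# "EXAMPLE_OK" is a made-up feature that passes the check.
sample = json.loads(
    '{"continuous_ks": {"P1_B4002": 1.0, "P2_SIT02": 0.5010,'
    ' "P1_FT03Z": 0.5243, "EXAMPLE_OK": 0.12}}'
)
worst = troublemakers(sample)
```

With the real file, the same call would be made on the result of `json.load` applied to the opened eval.json.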


manbo closed this issue

2026-02-05 11:01:24 +08:00


Reference: ModuFlow/mask-ddpm#2