# mask-ddpm Project Report — Hybrid Diffusion for ICS Traffic (HAI 21.03)

This report is a **complete, beginner‑friendly** description of the current project implementation as of the latest code in this repo. It explains **what the project does**, **how data flows**, **what each file is for**, and **why the architecture is designed this way**.

---

## 0. TL;DR

We generate multivariate ICS time‑series by **(1) learning temporal trend with GRU** and **(2) learning residuals with a hybrid diffusion model** (continuous DDPM + discrete masked diffusion). We then evaluate with **tie‑aware KS** and run **Type‑aware postprocessing** for diagnostic KS reduction.

---

## 1. Project Goal

Build a **hybrid diffusion-based generator** for ICS traffic features, focusing on **mixed continuous + discrete** feature sequences. The output is **feature-level sequences**, not raw packets. The generator should preserve:

- **Distributional fidelity** (continuous value ranges + discrete token frequencies)
- **Temporal consistency** (time correlation and sequence structure)
- **Field/logic consistency** for discrete protocol-like columns

Concretely, the synthetic sequences should be:

1) **Distribution-aligned**: each feature's CDF matches the real data (low KS)
2) **Temporally consistent**: lag-1 correlation and trend look realistic
3) **Discrete-valid**: state tokens are legal and frequency-consistent

This is hard because **distribution** and **temporal structure** often conflict in a single model.

---

## 2. Data & Feature Schema

**Dataset:** HAI 21.03, compressed CSV feature traces in `dataset/hai/hai-21.03/` (default glob in config: `train*.csv.gz`).

**Feature split (fixed schema):** `example/feature_split.json`

- `continuous`: real-valued sensor/process values
- `discrete`: binary/low-cardinality status/flag tokens (states, modes)
- `time_column`: the time index, excluded from modeling

---

## 3. Preprocessing

File: `example/prepare_data.py`

### Continuous features

- Streaming mean/std/min/max statistics + int-like detection
- Optional **log1p transform** for heavy-tailed columns
- Quantile table (if `use_quantile_transform=true`; skips extra standardization)
- **Full quantile stats** (`full_stats`) for stable post-hoc calibration
- Output: `example/results/cont_stats.json`

### Discrete features

- Token vocabulary built from the data (plus the most frequent token per column)
- Output: `example/results/disc_vocab.json`

Windowed batching with a **shuffle buffer** feeds both training stages.

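The streaming statistics above can be computed in a single pass, without loading all CSVs into memory. A minimal sketch using Welford's algorithm; `RunningStats` is a hypothetical helper, not the repository's actual `prepare_data.py` code:

```python
import math

class RunningStats:
    """Streaming mean/std/min/max in one pass (Welford's algorithm)."""
    def __init__(self):
        self.n = 0
        self.mean = 0.0
        self.m2 = 0.0          # running sum of squared deviations
        self.min = math.inf
        self.max = -math.inf

    def update(self, x: float) -> None:
        self.n += 1
        delta = x - self.mean
        self.mean += delta / self.n
        self.m2 += delta * (x - self.mean)
        self.min = min(self.min, x)
        self.max = max(self.max, x)

    @property
    def std(self) -> float:
        # population std; use m2 / (n - 1) for the sample estimate
        return math.sqrt(self.m2 / self.n) if self.n > 1 else 0.0

rs = RunningStats()
for v in [1.0, 2.0, 3.0, 4.0]:
    rs.update(v)
# mean = 2.5, population std = sqrt(1.25)
```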
File: `example/data_utils.py` provides:

- Normalization / inverse normalization
- Quantile transform / inverse quantile transform
- Post-calibration helpers

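The quantile transform and its inverse can be sketched as a piecewise-linear empirical-CDF mapping. This is a simplified uniform-output variant (the project may map to a normal distribution, TabDDPM-style); `fit_quantile_table`, `to_uniform`, and `from_uniform` are illustrative names, not the repo's API:

```python
import numpy as np

def fit_quantile_table(x: np.ndarray, n_quantiles: int = 101) -> np.ndarray:
    """Store reference quantiles of the training data."""
    probs = np.linspace(0.0, 1.0, n_quantiles)
    return np.quantile(x, probs)

def to_uniform(x: np.ndarray, table: np.ndarray) -> np.ndarray:
    """Map raw values to [0, 1] via the piecewise-linear empirical CDF."""
    probs = np.linspace(0.0, 1.0, len(table))
    return np.interp(x, table, probs)

def from_uniform(u: np.ndarray, table: np.ndarray) -> np.ndarray:
    """Inverse transform: map [0, 1] back to the data scale."""
    probs = np.linspace(0.0, 1.0, len(table))
    return np.interp(u, probs, table)

data = np.random.default_rng(0).normal(size=10_000)
table = fit_quantile_table(data)
u = to_uniform(data, table)
x_back = from_uniform(u, table)          # round trip recovers the data
```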
---

## 4. Architecture

### 4.1 Stage-1 Temporal GRU (Trend)

Defined in `example/hybrid_diffusion.py`.

- Class: `TemporalGRUGenerator`
- Input: continuous sequence
- Output: **trend sequence** (teacher-forced)
- Purpose: capture temporal structure

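To make the stage concrete, here is a minimal stand-in for a teacher-forced trend model. `TrendGRU` is a hypothetical sketch, not the repository's `TemporalGRUGenerator`; hidden size and feature count are illustrative:

```python
import torch
import torch.nn as nn

class TrendGRU(nn.Module):
    """Teacher-forced next-step predictor over continuous channels."""
    def __init__(self, n_features: int, hidden: int = 64):
        super().__init__()
        self.gru = nn.GRU(n_features, hidden, batch_first=True)
        self.head = nn.Linear(hidden, n_features)

    def forward(self, x: torch.Tensor) -> torch.Tensor:
        # x: (batch, seq_len, n_features); output at step t predicts x[:, t + 1]
        h, _ = self.gru(x)
        return self.head(h)

torch.manual_seed(0)
x = torch.randn(8, 96, 4)                  # a batch of sequences
model = TrendGRU(n_features=4)
pred_next = model(x[:, :-1])               # teacher forcing
loss = nn.functional.mse_loss(pred_next, x[:, 1:])
trend = model(x)                           # trend estimate
residual = x - trend                       # what the diffusion stage models
```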
### 4.2 Stage-2 Hybrid Diffusion (Residual)

Also defined in `example/hybrid_diffusion.py`.

**Inputs:**

- Continuous projection
- Discrete embeddings
- Sinusoidal time-step embedding
- Positional embedding (sequence index)
- Optional condition embedding (`file_id`)

**Continuous branch:**

- Gaussian DDPM on residuals
- Head predicts the target (`eps` or `x0`)

**Discrete branch:**

- Mask diffusion over tokens
- One classifier head (logits) per discrete column

**Backbone (configurable):**

- Transformer encoder with self-attention (current default, `backbone_type=transformer`)
- GRU is still supported as an option
- Post-LayerNorm + residual MLP

**Conditioning:**

- File-id conditioning (`use_condition=true`, `condition_type=file_id`)
- Type-1 features (setpoints/demands) can be passed as a **continuous condition** (`cond_cont`)

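The sinusoidal time-step embedding listed among the inputs is the standard Transformer-style construction; a minimal numpy sketch (dimension and step values are illustrative):

```python
import numpy as np

def sinusoidal_embedding(t: np.ndarray, dim: int) -> np.ndarray:
    """Transformer-style sinusoidal embedding for diffusion step t."""
    half = dim // 2
    freqs = np.exp(-np.log(10000.0) * np.arange(half) / half)
    args = t[:, None].astype(np.float64) * freqs[None, :]
    return np.concatenate([np.sin(args), np.cos(args)], axis=-1)

emb = sinusoidal_embedding(np.array([0, 10, 599]), dim=128)
# emb.shape == (3, 128); the t = 0 row is all sin(0) = 0 then cos(0) = 1
```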
---

### 4.3 Feature-Type-Aware Strategy

Based on HAI feature semantics and observed KS outliers, problematic features are classified into six types, each with its own planned modeling path:

1) **Type 1: exogenous setpoints / demands** (schedule-driven, piecewise-constant)
   Examples: P1_B4002, P2_MSD, P4_HT_LD
   Strategy: program generator (HSMM / change-point) or sampling from a program library; condition the diffusion on these.

2) **Type 2: controller outputs** (policy-like, with saturation and rate limits)
   Example: P1_B4005
   Strategy: small controller emulator (PID/NARX) with clamp + rate limit.

3) **Type 3: spiky actuators** (few operating points + long dwell)
   Examples: P1_PCV02Z, P1_FCV02Z
   Strategy: spike-and-slab + dwell-time modeling, or command-driven actuator dynamics.

4) **Type 4: quantized / digital-as-continuous**
   Examples: P4_ST_PT01, P4_ST_TT01
   Strategy: generate a latent continuous value then quantize, or treat as ordinal discrete diffusion.

5) **Type 5: derived conversions**
   Examples: *FT* → *FTZ*
   Strategy: generate the base variable and derive the conversion deterministically.

6) **Type 6: auxiliary / vibration / narrow-band**
   Examples: P2_24Vdc, P2_HILout
   Strategy: AR/ARMA or regime-conditioned narrow-band models.

### 4.4 Module Boundaries

- **Program generator** outputs Type-1 variables (setpoints/demands).
- **Controller/actuator modules** output Type-2/3 variables conditioned on Type-1.
- **Diffusion** generates the remaining continuous PVs + discrete features.
- **Post-processing** reconstructs Type-5 derived tags and applies calibration.

---

## 5. Training Flow

File: `example/train.py`

### 5.1 Stage-1 temporal training

- Uses continuous features (excluding Type1/Type5)
- A teacher-forced GRU predicts the next step
- Loss: **MSE**, i.e. `L_temporal = MSE(pred_next, x[:, 1:])`
- Output checkpoint: `temporal.pt`

### 5.2 Stage-2 diffusion training

- Compute the residual: `x_resid = x_cont - trend`, where `trend = GRU(x)`
- Sample a time step `t`
- Add Gaussian noise to the continuous residuals; mask tokens for the discrete columns
- The model predicts:
  - **eps_pred** (or `x0`) for the continuous residual
  - logits for the discrete tokens

### Loss design

- Continuous loss: MSE on eps or x0 (selected by `cont_target`)
- Optional inverse-variance weighting (`cont_loss_weighting=inv_std`)
- Optional SNR weighting (`snr_weighted_loss`) to stabilize diffusion training
- Optional quantile loss to align the residual distribution tails
- Optional residual mean/std penalty to reduce drift and improve KS
- Discrete loss: cross-entropy on masked tokens only
- Total: `loss = λ * loss_cont + (1 - λ) * loss_disc`
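The λ-weighted hybrid objective can be sketched as follows. Names, shapes, and the λ value are illustrative; the repo's actual loss additionally applies the optional weightings listed above:

```python
import torch
import torch.nn.functional as F

def hybrid_loss(eps_pred, eps_true, logits, tokens, mask, lam=0.8):
    """λ-weighted hybrid objective: MSE on the continuous target plus
    cross-entropy on *masked* discrete positions only (illustrative sketch)."""
    loss_cont = F.mse_loss(eps_pred, eps_true)
    # logits: (B, L, V), tokens: (B, L), mask: (B, L) bool
    if mask.any():
        loss_disc = F.cross_entropy(logits[mask], tokens[mask])
    else:
        loss_disc = logits.sum() * 0.0      # keep the graph connected
    return lam * loss_cont + (1.0 - lam) * loss_disc

torch.manual_seed(0)
B, L, V = 4, 96, 7
eps_pred, eps_true = torch.randn(B, L, 3), torch.randn(B, L, 3)
logits = torch.randn(B, L, V)
tokens = torch.randint(0, V, (B, L))
mask = torch.rand(B, L) < 0.3              # positions masked at step t
loss = hybrid_loss(eps_pred, eps_true, logits, tokens, mask)
```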
---

### Diffusion formulations

Forward process on the continuous residuals:

```
r_t = sqrt(a_bar_t) * r + sqrt(1 - a_bar_t) * eps
```

Supported prediction targets: **eps** and **x0**. Current config:

```
"cont_target": "x0"
```

The discrete branch uses mask diffusion with a cosine masking schedule:

```
p(t) = 0.5 * (1 - cos(pi * t / T))
```

Cross-entropy is computed on masked positions only.

---

## 6. Sampling & Export

Files: `example/sample.py`, `example/export_samples.py`

Steps:

1) Initialize continuous channels with Gaussian noise
2) Initialize discrete channels with mask tokens
3) Run the reverse diffusion loop from `t = T` down to `0`
4) Add the trend back (if the temporal stage is enabled)
5) Apply inverse transforms (quantile → raw; no extra de-standardization), then optional post-hoc quantile calibration
6) Bound values to the observed min/max if configured (`clamp` / `sigmoid` / `soft_tanh` / `none`)
7) Map discrete token ids back to vocabulary values; merge back Type1 (conditioning) and Type5 (derived) columns
8) Write `generated.csv`

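For the continuous branch, the reverse loop in step 3 is the standard DDPM ancestral step for an x0-predicting model. The linear β schedule and the dummy model (which always predicts `x0 = 0`, so the final step returns exactly zeros) are illustrative only:

```python
import numpy as np

rng = np.random.default_rng(0)
T = 600
betas = np.linspace(1e-4, 0.02, T)         # illustrative schedule
alphas = 1.0 - betas
abar = np.cumprod(alphas)

def ddpm_step(x_t, t, x0_pred):
    """One ancestral reverse step using the standard DDPM posterior q(x_{t-1} | x_t, x0)."""
    abar_prev = abar[t - 1] if t > 0 else 1.0
    coef_x0 = np.sqrt(abar_prev) * betas[t] / (1.0 - abar[t])
    coef_xt = np.sqrt(alphas[t]) * (1.0 - abar_prev) / (1.0 - abar[t])
    mean = coef_x0 * x0_pred + coef_xt * x_t
    if t == 0:
        return mean                         # no noise on the last step
    var = betas[t] * (1.0 - abar_prev) / (1.0 - abar[t])
    return mean + np.sqrt(var) * rng.standard_normal(x_t.shape)

x = rng.standard_normal((96, 4))            # (seq_len, n_cont) noise init
for t in range(T - 1, -1, -1):
    x = ddpm_step(x, t, x0_pred=np.zeros_like(x))
# with this dummy model the t = 0 step returns x0_pred, i.e. all zeros
```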
## 7. Evaluation

File: `example/evaluate_generated.py`

### Metrics

- **KS (tie-aware)** for continuous distributions
- **JSD** over discrete vocab frequencies, plus invalid-token counts
- **Lag-1 correlation diff** for temporal consistency
- Quantile diffs (q05/q25/q50/q75/q95) and mean/std errors

### Important

- The reference path supports **glob** patterns and aggregates **all matched files**
- The KS implementation is **tie-aware** (correct for spiky/quantized data)

Outputs:

- `example/results/eval.json`
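A tie-aware two-sample KS can be sketched by comparing empirical CDFs on the union of observed values, which stays well-defined when many samples share a value (standard p-value approximations assume continuous data). This is an illustrative implementation, not the repo's:

```python
import numpy as np

def ks_tie_aware(a: np.ndarray, b: np.ndarray) -> float:
    """Two-sample KS statistic robust to heavy ties (spiky/quantized features):
    compare the two ECDFs at every distinct observed value."""
    a, b = np.sort(a), np.sort(b)
    grid = np.union1d(a, b)
    cdf_a = np.searchsorted(a, grid, side="right") / a.size
    cdf_b = np.searchsorted(b, grid, side="right") / b.size
    return float(np.abs(cdf_a - cdf_b).max())

x = np.array([0.0] * 90 + [1.0] * 10)   # spiky feature: 90% mass at one value
y = np.array([0.0] * 50 + [1.0] * 50)
# ECDFs differ by |0.9 - 0.5| at value 0 → KS = 0.4
```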

---

## 8. Diagnostics

- `example/diagnose_ks.py`: per-feature KS + real-vs-generated CDF plots (`results/ks_per_feature.csv`, `results/cdf_<feature>.svg`); also reports boundary pile-up (`gen_frac_at_min` / `gen_frac_at_max`)
- `example/ranked_ks.py`: per-feature contribution to avg_ks, plus avg_ks after removing the top-N features (`results/ranked_ks.csv`)
- `example/filtered_metrics.py`: KS with hard-to-learn features excluded (tiny std or extreme KS); diagnostic only, not a final metric (`results/filtered_metrics.json`)
- `example/summary_metrics.py`: aggregates avg_ks / avg_jsd / avg_lag1_diff, appends each run to `results/metrics_history.csv`, and reports deltas vs. the previous run
- `example/program_stats.py`: Type-1 stats (change count / dwell / step size) → `results/program_stats.json`
- `example/controller_stats.py`: Type-2 stats (saturation ratio / rate of change / median step) → `results/controller_stats.json`
- `example/actuator_stats.py`: Type-3 stats (peak share / unique ratio / dwell) → `results/actuator_stats.json`
- `example/pv_stats.py`: Type-4 stats (q05/q50/q95 + tail ratio) → `results/pv_stats.json`
- `example/aux_stats.py`: Type-6 stats (mean / variance / lag-1) → `results/aux_stats.json`

---

## 9. Type-Aware Modeling

To reduce an average KS dominated by a few variables, the config defines **Type categories**:

- **Type1**: setpoints / demand (schedule-driven)
- **Type2**: controller outputs
- **Type3**: actuator positions
- **Type4**: PV sensors
- **Type5**: derived tags
- **Type6**: auxiliary / coupling

### Current implementation (diagnostic KS baseline)

File: `example/postprocess_types.py`

- Type1/2/3/5/6 → **empirical resampling** from the real distribution
- Type4 → keep the diffusion output

This is **not** the final model; it provides a **KS upper bound** for diagnosis.

Outputs:

- `example/results/generated_post.csv`
- `example/results/eval_post.json`

---
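The empirical-resampling baseline can be sketched as follows; `empirical_resample` is a hypothetical helper, and the column name is just an example. Note the trade-off stated above: per-column KS drops toward zero, but temporal and cross-feature structure is destroyed:

```python
import numpy as np
import pandas as pd

def empirical_resample(gen: pd.DataFrame, real: pd.DataFrame,
                       columns: list[str], seed: int = 0) -> pd.DataFrame:
    """KS-only baseline: replace selected generated columns with i.i.d. draws
    from the real marginal distribution (diagnostic upper bound only)."""
    rng = np.random.default_rng(seed)
    out = gen.copy()
    for col in columns:
        out[col] = rng.choice(real[col].to_numpy(), size=len(gen), replace=True)
    return out

real = pd.DataFrame({"P1_B4002": np.repeat([10.0, 20.0], 500)})
gen = pd.DataFrame({"P1_B4002": np.random.default_rng(1).normal(size=1000)})
post = empirical_resample(gen, real, ["P1_B4002"])
# post["P1_B4002"] now only takes the values 10.0 and 20.0
```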

## 10. Pipeline

File: `example/run_all.py` runs prepare/train/export/eval + postprocess + diagnostics in one command.

Default pipeline:

1) prepare_data
2) train (stage 1 + stage 2)
3) export_samples
4) evaluate_generated → `eval.json`
5) postprocess_types → `generated_post.csv`
6) evaluate_generated → `eval_post.json`
7) diagnostics scripts

**Linux:**

```bash
python example/run_all.py --device cuda --config example/config.json
```

**Windows (PowerShell):**

```powershell
python run_all.py --device cuda --config config.json
```

Related runners: `example/run_all_full.py` (legacy full runner with extra diagnostics) and `example/run_compare.py` (runs a baseline config vs. a temporal config and reports metric deltas).

---
## 11. Current Configuration (Key Defaults)

From `example/config.json`:

- `backbone_type`: `transformer`
- `timesteps`: 600
- `seq_len`: 96
- `batch_size`: 16
- `cont_target`: `x0`
- `cont_loss_weighting`: `inv_std`
- `snr_weighted_loss`: true
- `quantile_loss_weight`: 0.2
- `use_quantile_transform`: true
- `cont_post_calibrate`: true
- `use_temporal_stage1`: true

---
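A config like this is typically consumed by merging file contents over in-code defaults. A hedged sketch: `load_config` and `DEFAULTS` are hypothetical names, and the real scripts may read `example/config.json` differently; only the key/value pairs mirror the list above:

```python
import json
from pathlib import Path

# Defaults mirroring the key values listed above; the real example/config.json
# may contain additional keys.
DEFAULTS = {
    "backbone_type": "transformer",
    "timesteps": 600,
    "seq_len": 96,
    "batch_size": 16,
    "cont_target": "x0",
    "cont_loss_weighting": "inv_std",
    "snr_weighted_loss": True,
    "quantile_loss_weight": 0.2,
    "use_quantile_transform": True,
    "cont_post_calibrate": True,
    "use_temporal_stage1": True,
}

def load_config(path: str) -> dict:
    """Merge an on-disk JSON config over the defaults (hypothetical helper)."""
    cfg = dict(DEFAULTS)
    p = Path(path)
    if p.exists():
        cfg.update(json.loads(p.read_text()))
    return cfg

cfg = load_config("example/config.json")   # falls back to DEFAULTS if absent
```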
## 12. What's Actually Trained vs. What's Post-Processed

**Trained:**

- Temporal GRU (trend)
- Diffusion residual model (continuous + discrete)

**Post-processed (KS-only):**

- Type1/2/3/5/6 columns replaced by empirical resampling

This distinction matters: the postprocess improves KS but **may break joint realism**.

---

## 13. Why It's Still Hard

- Type1/2/3 features are **event-driven** and **piecewise constant**
- A Gaussian DDPM trained with MSE tends to smooth and blur such signals
- The temporal and distributional objectives pull in opposite directions

---

## 14. Where To Improve Next

1) Replace the KS-only postprocess with **conditional generators**:
   - Type1: program generator (HMM / schedule)
   - Type2: controller emulator (PID-like)
   - Type3: actuator dynamics (dwell + rate + saturation)
2) Add regime conditioning for Type4 PVs
3) Add joint-realism checks (cross-feature correlation)

**Evaluation protocol:** see `docs/evaluation.md`.

Recent runs (Windows):

- 2026-01-27 21:22:34 — avg_ks 0.4046 / avg_jsd 0.0376 / avg_lag1_diff 0.1449

Recent runs (WSL, diagnostic):

- 2026-01-28 — KS-only postprocess baseline (full-reference, tie-aware KS): overall_avg_ks 0.2851

---

## 15. Key Files

```
mask-ddpm/
  report.md
  docs/
    README.md
    architecture.md
    evaluation.md
    decisions.md
    experiments.md
    ideas.md
  example/
    config.json
    config_no_temporal.json
    config_temporal_strong.json
    feature_split.json
    data_utils.py
    prepare_data.py
    hybrid_diffusion.py
    train.py
    sample.py
    export_samples.py
    evaluate_generated.py
    run_all.py
    run_compare.py
    diagnose_ks.py
    filtered_metrics.py
    ranked_ks.py
    program_stats.py
    controller_stats.py
    actuator_stats.py
    pv_stats.py
    aux_stats.py
    postprocess_types.py
    results/
      generated.csv
      generated_post.csv
      eval.json
      eval_post.json
      cont_stats.json
      disc_vocab.json
      metrics_history.csv
```

---

## 16. Summary

Key engineering decisions:

- Mixed-type diffusion: separate continuous and discrete branches
- Two-stage training: temporal backbone first, then diffusion on the residuals
- Switchable backbone (GRU vs. Transformer encoder) for the diffusion model
- Positional + time-step embeddings for stability
- Optional inverse-variance and SNR weighting for the continuous loss
- `log1p` transforms for heavy-tailed signals
- Quantile transform + post-hoc calibration to stabilize CDF alignment

The current project is a **hybrid diffusion system** with a **two-stage temporal + residual design**, built to balance **distribution alignment** and **temporal realism**. The architecture is modular, with explicit type-aware diagnostics and postprocessing, and supports both GRU and Transformer backbones. The remaining research challenge is to replace KS-only postprocessing with **conditional, structurally consistent generators** for Type1/2/3/5/6 features.

---

## 17. Known Issues / Current Limitations

- KS can remain high on a subset of features → per-feature diagnosis required
- Lag-1 correlation may fluctuate → the distribution vs. temporal trade-off
- Discrete JSD can regress when continuous KS is prioritized
- The Transformer backbone may change training stability; compare against GRU systematically (`example/run_compare.py`)
- Program/actuator features require specialized modeling beyond diffusion

Suggested follow-ups:

- Explore **v-prediction** for the continuous branch
- Strengthen discrete diffusion (e.g., D3PM-style transitions)
- Add targeted discrete calibration for high-JSD columns
- Implement the Type-1 program generator and evaluate it with dwell/step metrics

---

## 18. Deliverables

- Code: diffusion + temporal + diagnostics + pipeline scripts
- Docs: report + decisions + experiments + architecture + evaluation protocol
- Results: full metrics, filtered metrics, ranked KS, per-feature CDFs