forked from manbo/internal-docs

Compare commits: dc5db4f3ec...master (3 commits)

| Author | SHA1 | Date |
|---|---|---|
| | 223965b107 | |
| | 6d97c42f51 | |
| | 70957d099f | |
.gitignore (vendored): 7 changed lines

@@ -1,7 +1,2 @@
 papers/Topic8 Network Traffic Generation/NetDiffusion Network Data Augmentation Through Protocol-Constrained Traffic Generation/NetDiffusion Network Data Augmentation Through Protocol-Constrained Traffic Gener_1_33_translate_20260130214700.pdf
-arxiv-style/*.pdf
-arxiv-style/*.aux
-arxiv-style/*.log
-arxiv-style/*.blg
-arxiv-style/*.bbl
-arxiv-style/*.out
+.DS_Store
@@ -1,262 +0,0 @@
\NeedsTeXFormat{LaTeX2e}

\ProcessOptions\relax

% fonts
\renewcommand{\rmdefault}{ptm}
\renewcommand{\sfdefault}{phv}

% set page geometry
\usepackage[verbose=true,letterpaper]{geometry}
\AtBeginDocument{
  \newgeometry{
    textheight=9in,
    textwidth=6.5in,
    top=1in,
    headheight=14pt,
    headsep=25pt,
    footskip=30pt
  }
}

\widowpenalty=10000
\clubpenalty=10000
\flushbottom
\sloppy

\newcommand{\headeright}{A Preprint}
\newcommand{\undertitle}{A Preprint}
\newcommand{\shorttitle}{\@title}

\usepackage{fancyhdr}
\fancyhf{}
\pagestyle{fancy}
\renewcommand{\headrulewidth}{0.4pt}
\fancyheadoffset{0pt}
\rhead{\scshape \footnotesize \headeright}
\chead{\shorttitle}
\cfoot{\thepage}

% Handling Keywords
\def\keywordname{{\bfseries \emph{Keywords}}}%
\def\keywords#1{\par\addvspace\medskipamount{\rightskip=0pt plus1cm
\def\and{\ifhmode\unskip\nobreak\fi\ $\cdot$
}\noindent\keywordname\enspace\ignorespaces#1\par}}

% font sizes with reduced leading
\renewcommand{\normalsize}{%
  \@setfontsize\normalsize\@xpt\@xipt
  \abovedisplayskip 7\p@ \@plus 2\p@ \@minus 5\p@
  \abovedisplayshortskip \z@ \@plus 3\p@
  \belowdisplayskip \abovedisplayskip
  \belowdisplayshortskip 4\p@ \@plus 3\p@ \@minus 3\p@
}
\normalsize
\renewcommand{\small}{%
  \@setfontsize\small\@ixpt\@xpt
  \abovedisplayskip 6\p@ \@plus 1.5\p@ \@minus 4\p@
  \abovedisplayshortskip \z@ \@plus 2\p@
  \belowdisplayskip \abovedisplayskip
  \belowdisplayshortskip 3\p@ \@plus 2\p@ \@minus 2\p@
}
\renewcommand{\footnotesize}{\@setfontsize\footnotesize\@ixpt\@xpt}
\renewcommand{\scriptsize}{\@setfontsize\scriptsize\@viipt\@viiipt}
\renewcommand{\tiny}{\@setfontsize\tiny\@vipt\@viipt}
\renewcommand{\large}{\@setfontsize\large\@xiipt{14}}
\renewcommand{\Large}{\@setfontsize\Large\@xivpt{16}}
\renewcommand{\LARGE}{\@setfontsize\LARGE\@xviipt{20}}
\renewcommand{\huge}{\@setfontsize\huge\@xxpt{23}}
\renewcommand{\Huge}{\@setfontsize\Huge\@xxvpt{28}}

% sections with less space
\providecommand{\section}{}
\renewcommand{\section}{%
  \@startsection{section}{1}{\z@}%
  {-2.0ex \@plus -0.5ex \@minus -0.2ex}%
  { 1.5ex \@plus 0.3ex \@minus 0.2ex}%
  {\large\bf\raggedright}%
}
\providecommand{\subsection}{}
\renewcommand{\subsection}{%
  \@startsection{subsection}{2}{\z@}%
  {-1.8ex \@plus -0.5ex \@minus -0.2ex}%
  { 0.8ex \@plus 0.2ex}%
  {\normalsize\bf\raggedright}%
}
\providecommand{\subsubsection}{}
\renewcommand{\subsubsection}{%
  \@startsection{subsubsection}{3}{\z@}%
  {-1.5ex \@plus -0.5ex \@minus -0.2ex}%
  { 0.5ex \@plus 0.2ex}%
  {\normalsize\bf\raggedright}%
}
\providecommand{\paragraph}{}
\renewcommand{\paragraph}{%
  \@startsection{paragraph}{4}{\z@}%
  {1.5ex \@plus 0.5ex \@minus 0.2ex}%
  {-1em}%
  {\normalsize\bf}%
}
\providecommand{\subparagraph}{}
\renewcommand{\subparagraph}{%
  \@startsection{subparagraph}{5}{\z@}%
  {1.5ex \@plus 0.5ex \@minus 0.2ex}%
  {-1em}%
  {\normalsize\bf}%
}
\providecommand{\subsubsubsection}{}
\renewcommand{\subsubsubsection}{%
  \vskip5pt{\noindent\normalsize\rm\raggedright}%
}

% float placement
\renewcommand{\topfraction }{0.85}
\renewcommand{\bottomfraction }{0.4}
\renewcommand{\textfraction }{0.1}
\renewcommand{\floatpagefraction}{0.7}

\newlength{\@abovecaptionskip}\setlength{\@abovecaptionskip}{7\p@}
\newlength{\@belowcaptionskip}\setlength{\@belowcaptionskip}{\z@}

\setlength{\abovecaptionskip}{\@abovecaptionskip}
\setlength{\belowcaptionskip}{\@belowcaptionskip}

% swap above/belowcaptionskip lengths for tables
\renewenvironment{table}
  {\setlength{\abovecaptionskip}{\@belowcaptionskip}%
   \setlength{\belowcaptionskip}{\@abovecaptionskip}%
   \@float{table}}
  {\end@float}

% footnote formatting
\setlength{\footnotesep }{6.65\p@}
\setlength{\skip\footins}{9\p@ \@plus 4\p@ \@minus 2\p@}
\renewcommand{\footnoterule}{\kern-3\p@ \hrule width 12pc \kern 2.6\p@}
\setcounter{footnote}{0}

% paragraph formatting
\setlength{\parindent}{\z@}
\setlength{\parskip }{5.5\p@}

% list formatting
\setlength{\topsep }{4\p@ \@plus 1\p@ \@minus 2\p@}
\setlength{\partopsep }{1\p@ \@plus 0.5\p@ \@minus 0.5\p@}
\setlength{\itemsep }{2\p@ \@plus 1\p@ \@minus 0.5\p@}
\setlength{\parsep }{2\p@ \@plus 1\p@ \@minus 0.5\p@}
\setlength{\leftmargin }{3pc}
\setlength{\leftmargini }{\leftmargin}
\setlength{\leftmarginii }{2em}
\setlength{\leftmarginiii}{1.5em}
\setlength{\leftmarginiv }{1.0em}
\setlength{\leftmarginv }{0.5em}
\def\@listi  {\leftmargin\leftmargini}
\def\@listii {\leftmargin\leftmarginii
  \labelwidth\leftmarginii
  \advance\labelwidth-\labelsep
  \topsep 2\p@ \@plus 1\p@ \@minus 0.5\p@
  \parsep 1\p@ \@plus 0.5\p@ \@minus 0.5\p@
  \itemsep \parsep}
\def\@listiii{\leftmargin\leftmarginiii
  \labelwidth\leftmarginiii
  \advance\labelwidth-\labelsep
  \topsep 1\p@ \@plus 0.5\p@ \@minus 0.5\p@
  \parsep \z@
  \partopsep 0.5\p@ \@plus 0\p@ \@minus 0.5\p@
  \itemsep \topsep}
\def\@listiv {\leftmargin\leftmarginiv
  \labelwidth\leftmarginiv
  \advance\labelwidth-\labelsep}
\def\@listv  {\leftmargin\leftmarginv
  \labelwidth\leftmarginv
  \advance\labelwidth-\labelsep}
\def\@listvi {\leftmargin\leftmarginvi
  \labelwidth\leftmarginvi
  \advance\labelwidth-\labelsep}

% create title
\providecommand{\maketitle}{}
\renewcommand{\maketitle}{%
  \par
  \begingroup
    \renewcommand{\thefootnote}{\fnsymbol{footnote}}
    % for perfect author name centering
    %\renewcommand{\@makefnmark}{\hbox to \z@{$^{\@thefnmark}$\hss}}
    % The footnote-mark was overlapping the footnote-text,
    % added the following to fix this problem (MK)
    \long\def\@makefntext##1{%
      \parindent 1em\noindent
      \hbox to 1.8em{\hss $\m@th ^{\@thefnmark}$}##1
    }
    \thispagestyle{empty}
    \@maketitle
    \@thanks
    %\@notice
  \endgroup
  \let\maketitle\relax
  \let\thanks\relax
}

% rules for title box at top of first page
\newcommand{\@toptitlebar}{
  \hrule height 2\p@
  \vskip 0.25in
  \vskip -\parskip%
}
\newcommand{\@bottomtitlebar}{
  \vskip 0.29in
  \vskip -\parskip
  \hrule height 2\p@
  \vskip 0.09in%
}

% create title (includes both anonymized and non-anonymized versions)
\providecommand{\@maketitle}{}
\renewcommand{\@maketitle}{%
  \vbox{%
    \hsize\textwidth
    \linewidth\hsize
    \vskip 0.1in
    \@toptitlebar
    \centering
    {\LARGE\sc \@title\par}
    \@bottomtitlebar
    \textsc{\undertitle}\\
    \vskip 0.1in
    \def\And{%
      \end{tabular}\hfil\linebreak[0]\hfil%
      \begin{tabular}[t]{c}\bf\rule{\z@}{24\p@}\ignorespaces%
    }
    \def\AND{%
      \end{tabular}\hfil\linebreak[4]\hfil%
      \begin{tabular}[t]{c}\bf\rule{\z@}{24\p@}\ignorespaces%
    }
    \begin{tabular}[t]{c}\bf\rule{\z@}{24\p@}\@author\end{tabular}%
    \vskip 0.4in \@minus 0.1in \center{\@date} \vskip 0.2in
  }
}

% add conference notice to bottom of first page
\newcommand{\ftype@noticebox}{8}
\newcommand{\@notice}{%
  % give a bit of extra room back to authors on first page
  \enlargethispage{2\baselineskip}%
  \@float{noticebox}[b]%
    \footnotesize\@noticestring%
  \end@float%
}

% abstract styling
\renewenvironment{abstract}
{
  \centerline
  {\large \bfseries \scshape Abstract}
  \begin{quote}
}
{
  \end{quote}
}

\endinput
@@ -1,95 +0,0 @@
\documentclass[10pt, twocolumn]{article}
\usepackage{amsmath, amssymb}
\usepackage{bm}
\usepackage[margin=1in]{geometry}
\usepackage{microtype}

\title{Equations: Mask-DDPM Methodology}
\author{}
\date{}

\begin{document}
\maketitle

\section{Problem Formulation}
Each training instance is a fixed-length window of length $L$, comprising continuous channels $\bm{X} \in \mathbb{R}^{L \times d_c}$ and discrete channels $\bm{Y} = \{y^{(j)}_{1:L}\}_{j=1}^{d_d}$, where each discrete variable satisfies $y^{(j)}_t \in \mathcal{V}_j$ for a finite vocabulary $\mathcal{V}_j$.

\section{Transformer Trend Module for Continuous Dynamics}
We posit an additive decomposition of the continuous signal:
\begin{equation}
\bm{X} = \bm{S} + \bm{R},
\label{eq:additive_decomp}
\end{equation}
where $\bm{S} \in \mathbb{R}^{L \times d_c}$ captures the smooth temporal trend and $\bm{R} \in \mathbb{R}^{L \times d_c}$ represents distributional residuals.

The causal Transformer trend extractor $f_{\phi}$ predicts the next-step trend via:
\begin{equation}
\hat{\bm{S}}_{t+1} = f_{\phi}(\bm{X}_{1:t}), \quad t = 1, \dots, L-1.
\label{eq:trend_prediction}
\end{equation}
Training minimizes the mean-squared error:
\begin{equation}
\mathcal{L}_{\text{trend}}(\phi) = \frac{1}{(L-1)d_c} \sum_{t=1}^{L-1} \bigl\| \hat{\bm{S}}_{t+1} - \bm{X}_{t+1} \bigr\|_2^2.
\label{eq:trend_loss}
\end{equation}
At inference, the residual target is defined as $\bm{R} = \bm{X} - \hat{\bm{S}}$.

\section{DDPM for Continuous Residual Generation}
Let $K$ denote the number of diffusion steps, with noise schedule $\{\beta_k\}_{k=1}^K$, $\alpha_k = 1 - \beta_k$, and $\bar{\alpha}_k = \prod_{i=1}^k \alpha_i$. The forward corruption process is:
\begin{align}
q(\bm{r}_k \mid \bm{r}_0) &= \mathcal{N}\bigl( \sqrt{\bar{\alpha}_k}\,\bm{r}_0,\; (1 - \bar{\alpha}_k)\mathbf{I} \bigr), \\
\bm{r}_k &= \sqrt{\bar{\alpha}_k}\,\bm{r}_0 + \sqrt{1 - \bar{\alpha}_k}\,\boldsymbol{\epsilon}, \quad \boldsymbol{\epsilon} \sim \mathcal{N}(\mathbf{0}, \mathbf{I}),
\label{eq:forward_process}
\end{align}
where $\bm{r}_0 \equiv \bm{R}$.

The reverse process is parameterized as:
\begin{equation}
p_{\theta}(\bm{r}_{k-1} \mid \bm{r}_k, \hat{\bm{S}}) = \mathcal{N}\bigl( \boldsymbol{\mu}_{\theta}(\bm{r}_k, k, \hat{\bm{S}}),\; \boldsymbol{\Sigma}(k) \bigr).
\label{eq:reverse_process}
\end{equation}
Training employs the $\epsilon$-prediction objective:
\begin{equation}
\mathcal{L}_{\text{cont}}(\theta) = \mathbb{E}_{k,\bm{r}_0,\boldsymbol{\epsilon}} \left[ \bigl\| \boldsymbol{\epsilon} - \boldsymbol{\epsilon}_{\theta}(\bm{r}_k, k, \hat{\bm{S}}) \bigr\|_2^2 \right].
\label{eq:ddpm_loss}
\end{equation}
Optionally, SNR-based reweighting yields:
\begin{equation}
\mathcal{L}^{\text{snr}}_{\text{cont}}(\theta) = \mathbb{E}_{k,\bm{r}_0,\boldsymbol{\epsilon}} \left[ w_k \bigl\| \boldsymbol{\epsilon} - \boldsymbol{\epsilon}_{\theta}(\bm{r}_k, k, \hat{\bm{S}}) \bigr\|_2^2 \right],
\label{eq:snr_loss}
\end{equation}
where $w_k = \min(\mathrm{SNR}_k, \gamma) / \mathrm{SNR}_k$ and $\mathrm{SNR}_k = \bar{\alpha}_k / (1 - \bar{\alpha}_k)$. The final continuous output is reconstructed as $\hat{\bm{X}} = \hat{\bm{S}} + \hat{\bm{R}}$.
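For concreteness, the schedule quantities above can be precomputed as in the following minimal sketch (a sketch only: a linear $\beta$ schedule is assumed, and the values of $K$ and $\gamma$ are illustrative):
\begin{verbatim}
# Minimal sketch: precompute schedule quantities (linear beta assumed).
import numpy as np

K, gamma = 1000, 5.0                 # illustrative choices
beta = np.linspace(1e-4, 0.02, K)    # {beta_k}
alpha = 1.0 - beta                   # alpha_k = 1 - beta_k
alpha_bar = np.cumprod(alpha)        # bar{alpha}_k = prod_i alpha_i
snr = alpha_bar / (1.0 - alpha_bar)  # SNR_k
w = np.minimum(snr, gamma) / snr     # w_k = min(SNR_k, gamma) / SNR_k
\end{verbatim}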
\section{Masked Diffusion for Discrete Variables}
For discrete channel $j$, the forward masking process follows schedule $\{m_k\}_{k=1}^K$:
\begin{equation}
q(y^{(j)}_k \mid y^{(j)}_0) =
\begin{cases}
y^{(j)}_0, & \text{with probability } 1 - m_k, \\
\texttt{[MASK]}, & \text{with probability } m_k,
\end{cases}
\label{eq:masking_process}
\end{equation}
applied independently across variables and timesteps.

The denoiser $h_{\psi}$ predicts categorical distributions conditioned on continuous context:
\begin{equation}
p_{\psi}\bigl( y^{(j)}_0 \mid y_k, k, \hat{\bm{S}}, \hat{\bm{X}} \bigr) = h_{\psi}(y_k, k, \hat{\bm{S}}, \hat{\bm{X}}).
\label{eq:discrete_denoising}
\end{equation}
Training minimizes the categorical cross-entropy:
\begin{equation}
\mathcal{L}_{\text{disc}}(\psi) = \mathbb{E}_{k} \left[ \frac{1}{|\mathcal{M}|} \sum_{(j,t) \in \mathcal{M}} \mathrm{CE}\bigl( h_{\psi}(y_k, k, \hat{\bm{S}}, \hat{\bm{X}})_{j,t},\; y^{(j)}_{0,t} \bigr) \right],
\label{eq:discrete_loss}
\end{equation}
where $\mathcal{M}$ denotes masked positions at step $k$.

\section{Joint Optimization}
The combined objective balances continuous and discrete learning:
\begin{equation}
\mathcal{L} = \lambda \, \mathcal{L}_{\text{cont}} + (1 - \lambda) \, \mathcal{L}_{\text{disc}}, \quad \lambda \in [0,1].
\label{eq:joint_objective}
\end{equation}
Type-aware routing enforces deterministic reconstruction $\hat{x}^{(i)} = g_i(\hat{\bm{X}}, \hat{\bm{Y}})$ for derived variables.

\end{document}

Binary file not shown (before: 582 KiB).
Binary file not shown (before: 669 KiB).
@@ -1,256 +0,0 @@
\documentclass{article}

\usepackage{arxiv}

\usepackage[utf8]{inputenc}   % allow utf-8 input
\usepackage[T1]{fontenc}      % use 8-bit T1 fonts
\usepackage{hyperref}         % hyperlinks
\usepackage{url}              % simple URL typesetting
\usepackage{booktabs}         % professional-quality tables
\usepackage{amsfonts}         % blackboard math symbols
\usepackage{nicefrac}         % compact symbols for 1/2, etc.
\usepackage{microtype}        % microtypography
\usepackage{amsmath}          % cleveref must be loaded after amsmath!
\usepackage{cleveref}         % smart cross-referencing
\usepackage{lipsum}           % can be removed after putting your text content
\usepackage{graphicx}
\usepackage{natbib}
\usepackage{doi}

% Packages for equations
\usepackage{amssymb}
\usepackage{bm}

% Title
\title{Your Paper Title: A Deep Learning Approach for Something}

% Uncomment the following line if no date is needed
%\date{}

\newif\ifuniqueAffiliation
\uniqueAffiliationtrue

\ifuniqueAffiliation % standard author block
\author{
David S.~Hippocampus \\
Department of Computer Science\\
Cranberry-Lemon University\\
Pittsburgh, PA 15213 \\
\texttt{hippo@cs.cranberry-lemon.edu} \\
\And
Elias D.~Striatum \\
Department of Electrical Engineering\\
Mount-Sheikh University\\
Santa Narimana, Levand \\
\texttt{stariate@ee.mount-sheikh.edu} \\
\And
John Q.~Doe \\
Department of Mathematics\\
University of California, Berkeley\\
Berkeley, CA 94720 \\
\texttt{johndoe@math.berkeley.edu}
}
\fi

% Running-head settings
\renewcommand{\shorttitle}{\textit{arXiv} Template}

%%% PDF metadata
\hypersetup{
pdftitle={Your Paper Title},
pdfsubject={cs.LG, cs.CR},
pdfauthor={David S.~Hippocampus, Elias D.~Striatum},
pdfkeywords={Keyword1, Keyword2, Keyword3},
}

\begin{document}
\maketitle

\begin{abstract}
Here is the abstract of your paper.
\end{abstract}

% Keywords
\keywords{Machine Learning \and Cyber Defense \and Benchmark \and Methodology}

% 1. Introduction
\section{Introduction}
\label{sec:intro}
This section introduces the background, problem statement, and contributions.

% 2. Related Work
\section{Related Work}
\label{sec:related}
Early work on ``realistic'' network data generation mostly stayed at the packet/flow-header level, relying on replay or on statistical synthesis from single-point observations. Swing extracts user, application, and network distributions from single-point observations in a closed-loop, network-responsive manner to reproduce burstiness and correlation across multiple time scales \citep{10.1145/1151659.1159928,10.1145/1159913.1159928}. A series of later works advanced header synthesis toward learning-based generation: a WGAN-based method added explicit verification of protocol-field consistency for NetFlow/IPFIX \citep{Ring_2019}; NetShare recast header modeling as flow-level time series and improved fidelity and scalability through domain encoding and parallel fine-tuning \citep{10.1145/3544216.3544251}; and DoppelGANger preserved the long-range structure and downstream ranking consistency of networked time series by decoupling attributes from sequences \citep{Lin_2020}. In industrial control system (ICS) scenarios, however, raw PCAPs are usually not shareable, and public testbeds (such as SWaT and WADI) mostly provide process/monitoring telemetry and protocol interactions for security assessment, so public datasets emphasize operational variables rather than packet-level traces \citep{7469060,10.1145/3055366.3055375}. This makes protocol- and semantics-aware synthesis at the feature/telemetry level both more feasible and more necessary in practice: the goal is to reproduce high-level distributions and multi-scale temporal patterns under operational semantics and physical constraints, without relying on the original packets. From this perspective, the generation paradigm naturally shifts from reproducing packet syntax to modeling high-level spatio-temporal distributions and their uncertainty, which requires stable training, strong distribution fitting, and interpretable uncertainty characterization.

Diffusion models fit this path well. DDPM achieves high-quality sampling and stable optimization through an efficient $\epsilon$-parameterization and a weighted variational objective \citep{NEURIPS2020_4c5bcfec}, and the SDE perspective unifies score-based and diffusion models, providing likelihood evaluation and predictor--corrector sampling based on probability-flow ODEs \citep{song2021scorebasedgenerativemodelingstochastic}. For time series, TimeGrad replaces a constrained output distribution with conditional denoising, capturing high-dimensional correlations at each step \citep{rasul2021autoregressivedenoisingdiffusionmodels}; CSDI performs explicitly conditional diffusion and uses two-dimensional attention to exploit temporal and cross-feature dependencies simultaneously, making it well suited to conditioning and imputation \citep{tashiro2021csdiconditionalscorebaseddiffusion}; for more general spatio-temporal structure, DiffSTG generalizes diffusion to spatio-temporal graphs, combining TCN/GCN with a denoising U-Net to improve CRPS and inference efficiency non-autoregressively \citep{wen2024diffstgprobabilisticspatiotemporalgraph}, and PriSTI further strengthens conditional features and geographic relationships, remaining robust under high missing rates and sensor failures \citep{liu2023pristiconditionaldiffusionframework}; for long sequences in continuous domains, DiffWave shows that diffusion can match the quality of strong vocoders under fast, non-autoregressive synthesis \citep{kong2021diffwaveversatilediffusionmodel}; and studies of cellular communication traffic show that diffusion can recover spatio-temporal patterns and characterize uncertainty at urban scale \citep{11087622}. Taken together, these results point to one conclusion: when the focus is telemetry and high-level features rather than raw messages, diffusion models provide stable, fine-grained distribution fitting and uncertainty quantification, exactly what ICS telemetry synthesis requires. At the same time, entrusting all structure to a single monolithic diffusion model is inadvisable: long-range temporal skeletons and fine-grained marginal distributions are often in optimization tension and should be explicitly decoupled in the model.

Looking further at the mechanisms of ICS: its channels are inherently mixed, containing both continuous process trajectories and discrete supervisory/status variables, and discrete channels must remain ``legal'' under operational constraints. The time-series diffusion progress above has occurred mainly in continuous spaces, but discrete diffusion has also developed systematic methods: D3PM improves sampling quality and likelihood through absorbing/masking and structured transitions in discrete state spaces \citep{austin2023structureddenoisingdiffusionmodels}; subsequent masked-diffusion work provides stable reconstruction of categorical data in a simplified form \citep{Lin_2020}; multinomial diffusion defines diffusion directly on a finite vocabulary through mechanisms such as argmax flows \citep{hoogeboom2021argmaxflowsmultinomialdiffusion}; and Diffusion-LM demonstrates an effective path to controllable text generation by imposing gradient constraints in continuous latent spaces \citep{li2022diffusionlmimprovescontrollabletext}. From the perspective of protocols and finite-state machines, coverage-guided fuzzing stresses the criticality of sequence legality and state coverage \citep{meng2025aflnetyearslatercoverageguided,godefroid2017learnfuzzmachinelearninginput,she2019neuzzefficientfuzzingneural}, echoing the ``legality by construction'' principle of discrete diffusion: prefer absorbing/masking diffusion on discrete channels, supplemented by type-aware conditioning and sampling constraints, to avoid the semantic invalidity and marginal distortion caused by post hoc thresholding.

From the standpoint of high-level synthesis, temporal structure is equally indispensable: ICS control often involves delayed effects, phased operating conditions, and cross-channel coupling, so models must capture low-frequency, long-range dependencies while overlaying multi-modal fine-grained fluctuations on them. The Transformer line provides ample evidence on long-sequence time-series tasks: Transformer-XL breaks the fixed-length context limit with a reusable memory mechanism and markedly strengthens long-range dependency modeling \citep{dai2019transformerxlattentivelanguagemodels}; Informer balances span and efficiency in long-sequence forecasting with ProbSparse attention and efficient decoding \citep{zhou2021informerefficienttransformerlong}; Autoformer robustly models long-term seasonality and trends through autocorrelation and decomposition mechanisms \citep{wu2022autoformerdecompositiontransformersautocorrelation}; FEDformer further improves long-horizon forecasting with frequency-domain enhancement and decomposition \citep{zhou2022fedformerfrequencyenhanceddecomposed}; and PatchTST improves the stability and generalization of long multivariate forecasting through local patch representations and channel-independent modeling \citep{2023}. Combined with our positioning of diffusion above, this chain of evidence suggests a natural division of labor: use attention-based sequence models to first extract stable low-frequency trends and conditions (the long-range skeleton), let diffusion focus on margins and details in the residual space, and apply masking/absorbing diffusion to supervisory/mode variables so that vocabulary legality holds by construction. This design inherits the strengths of time-series diffusion in distribution fitting and uncertainty characterization \citep{rasul2021autoregressivedenoisingdiffusionmodels,tashiro2021csdiconditionalscorebaseddiffusion,wen2024diffstgprobabilisticspatiotemporalgraph,liu2023pristiconditionaldiffusionframework,kong2021diffwaveversatilediffusionmodel,11087622}, while the Transformer's long-range attention stabilizes the macroscopic temporal support, yielding an operational, integrated generation pipeline under the mixed types and multi-scale dynamics of ICS.

% 3. Methodology
\section{Methodology}
\label{sec:method}
Industrial control system (ICS) telemetry is intrinsically mixed-type and mechanistically heterogeneous: continuous process trajectories (e.g., sensor and actuator signals) coexist with discrete supervisory states (e.g., modes, alarms, interlocks), and the underlying generating mechanisms range from physical inertia to program-driven step logic. This heterogeneity is not cosmetic: it directly affects what ``realistic'' synthesis means, because a generator must jointly satisfy (i) temporal coherence, (ii) distributional fidelity, and (iii) discrete semantic validity (i.e., every discrete output must belong to its legal vocabulary by construction). These properties are emphasized broadly in operational-technology security guidance and ICS engineering practice, where state logic and physical dynamics are tightly coupled \citep{nist2023sp80082}.

We formalize each training instance as a fixed-length window of length $L$, comprising continuous channels $\bm{X} \in \mathbb{R}^{L \times d_c}$ and discrete channels $\bm{Y} = \{y^{(j)}_{1:L}\}_{j=1}^{d_d}$, where each discrete variable satisfies $y^{(j)}_t \in \mathcal{V}_j$ for a finite vocabulary $\mathcal{V}_j$. Our objective is to learn a generator that produces synthetic $(\hat{\bm{X}}, \hat{\bm{Y}})$ that are simultaneously temporally coherent and distributionally faithful, while ensuring $\hat{y}^{(j)}_t\in\mathcal{V}_j$ for all $j$, $t$ by construction (rather than via post-hoc rounding or thresholding).
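As a concrete illustration of this data layout, the following minimal sketch shows one mixed-type window (a sketch only: the shapes, sizes, and the \texttt{Window} container are illustrative, not part of the method):
\begin{verbatim}
# Minimal sketch of one training window (all sizes illustrative).
from dataclasses import dataclass
import numpy as np

@dataclass
class Window:
    X: np.ndarray      # continuous channels, shape (L, d_c), float
    Y: np.ndarray      # discrete channels, shape (L, d_d), integer codes
    vocab_sizes: list  # |V_j| for each discrete channel j

L, d_c, d_d = 128, 8, 3
w = Window(X=np.random.randn(L, d_c),
           Y=np.random.randint(0, 4, size=(L, d_d)),
           vocab_sizes=[4, 4, 4])
# legality check: every discrete value lies in its finite vocabulary
assert all((w.Y[:, j] < w.vocab_sizes[j]).all() for j in range(d_d))
\end{verbatim}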
A key empirical and methodological tension in ICS synthesis is that temporal realism and marginal/distributional realism can compete when optimized monolithically: sequence models trained primarily for regression often over-smooth heavy tails and intermittent bursts, while purely distribution-matching objectives can erode long-range structure. Diffusion models provide a principled route to rich distribution modeling through iterative denoising, but they do not, by themselves, resolve (i) the need for a stable low-frequency temporal scaffold, nor (ii) the discrete legality constraints for supervisory variables \citep{ho2020denoising,song2021score}. Recent time-series diffusion work further suggests that separating coarse structure from stochastic refinement can be an effective inductive bias for long-horizon realism \citep{kollovieh2023tsdiff,sikder2023transfusion}.

\begin{figure}[htbp]
  \centering
  \includegraphics[width=0.8\textwidth]{fig-design-v2.png}
  % \caption{Description of the figure.}
  \label{fig:design}
\end{figure}

Motivated by these considerations, we propose Mask-DDPM, organized in the following order:
\begin{enumerate}
\item Transformer trend module: learns the dominant temporal backbone of continuous dynamics via attention-based sequence modeling \citep{vaswani2017attention}.

\item Residual DDPM for continuous variables: models distributional detail as stochastic residual structure conditioned on the learned trend \citep{ho2020denoising,kollovieh2023tsdiff}.

\item Masked diffusion for discrete variables: generates discrete ICS states with an absorbing/masking corruption process and categorical reconstruction \citep{austin2021structured, shi2024simplified}.

\item Type-aware decomposition: a type-aware factorization and routing layer that assigns variables to the most appropriate modeling mechanism and enforces deterministic constraints where warranted.
\end{enumerate}

This ordering is intentional. The trend module establishes a macro-temporal scaffold; residual diffusion then concentrates capacity on micro-structure and marginal fidelity; masked diffusion provides a native mechanism for discrete legality; and the type-aware layer operationalizes the observation that not all ICS variables should be modeled with the same stochastic mechanism. Importantly, while diffusion-based generation for ICS telemetry has begun to emerge, existing approaches remain limited and typically emphasize continuous synthesis or augmentation; in contrast, our pipeline integrates (i) a Transformer-conditioned residual diffusion backbone, (ii) a discrete masked-diffusion branch, and (iii) explicit type-aware routing for heterogeneous variable mechanisms within a single coherent generator \citep{yuan2025ctu,sha2026ddpm}.

\subsection{Transformer trend module for continuous dynamics}
\label{sec:method-trans}
We instantiate the temporal backbone as a causal Transformer trend extractor, leveraging self-attention's ability to represent long-range dependencies and cross-channel interactions without recurrence \citep{vaswani2017attention}. Compared with recurrent trend extractors (e.g., GRU-style backbones), a Transformer trend module offers a direct mechanism to model delayed effects and multivariate coupling, which are common in ICS, where control actions may influence downstream sensors with nontrivial lags and regime-dependent propagation \citep{vaswani2017attention,nist2023sp80082}. Crucially, in our design the Transformer is not asked to be the entire generator; instead, it serves a deliberately restricted role: providing a stable, temporally coherent conditioning signal that later stochastic components refine.

For continuous channels $\bm{X}$, we posit an additive decomposition:
\begin{equation}
\bm{X} = \bm{S} + \bm{R},
\label{eq:additive_decomp}
\end{equation}
where $\bm{S} \in \mathbb{R}^{L \times d_c}$ is a smooth trend capturing predictable temporal evolution, and $\bm{R} \in \mathbb{R}^{L \times d_c}$ is a residual capturing distributional detail (e.g., bursts, heavy tails, local fluctuations) that is difficult to represent robustly with a purely regression-based temporal objective. This separation reflects an explicit division of labor: the trend module prioritizes temporal coherence, while diffusion (introduced next) targets distributional realism at the residual level, a strategy aligned with ``predict-then-refine'' perspectives in time-series diffusion modeling \citep{kollovieh2023tsdiff,sikder2023transfusion}.

We parameterize the trend $\bm{S}$ using a causal Transformer $f_\phi$. With teacher forcing, we train $f_\phi$ to predict the next-step trend from past observations:
\begin{equation}
\hat{\bm{S}}_{t+1} = f_{\phi}(\bm{X}_{1:t}), \quad t = 1, \dots, L-1,
\label{eq:trend_prediction}
\end{equation}
using the mean-squared error objective:
\begin{equation}
\mathcal{L}_{\text{trend}}(\phi) = \frac{1}{(L-1)d_c} \sum_{t=1}^{L-1} \bigl\| \hat{\bm{S}}_{t+1} - \bm{X}_{t+1} \bigr\|_2^2.
\label{eq:trend_loss}
\end{equation}
At inference, we roll out the Transformer autoregressively to obtain $\hat{\bm{S}}$, and then define the residual target for diffusion as $\bm{R} = \bm{X} - \hat{\bm{S}}$. This setup intentionally ``locks in'' a coherent low-frequency scaffold before any stochastic refinement is applied, reducing the burden on downstream diffusion modules to learn both long-range structure and marginal detail simultaneously. In this sense, our use of Transformers is distinctive: it is a conditioning-first temporal backbone designed to stabilize mixed-type diffusion synthesis in ICS, rather than an end-to-end monolithic generator \citep{vaswani2017attention,kollovieh2023tsdiff,yuan2025ctu}.
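The teacher-forced objective in Eq.~\eqref{eq:trend_loss} reduces to a few lines of code; the following sketch (PyTorch assumed; \texttt{f\_phi} stands for any causal sequence model, e.g.\ a Transformer encoder with a causal mask) is illustrative rather than our exact implementation:
\begin{verbatim}
# Sketch: teacher-forced next-step trend loss (PyTorch assumed).
import torch

def trend_loss(f_phi, X):            # X: (B, L, d_c)
    S_hat = f_phi(X[:, :-1, :])      # predictions for steps 2..L
    target = X[:, 1:, :]             # next-step targets X_{t+1}
    return torch.mean((S_hat - target) ** 2)
\end{verbatim}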
\subsection{DDPM for continuous residual generation}
\label{sec:method-ddpm}
We model the residual $\bm{R}$ with a denoising diffusion probabilistic model (DDPM) conditioned on the trend $\hat{\bm{S}}$ \citep{ho2020denoising}. Diffusion models learn complex data distributions by inverting a tractable noising process through iterative denoising, and have proven effective at capturing multimodality and heavy-tailed structure that is often attenuated by purely regression-based sequence models \citep{ho2020denoising,song2021score}. Conditioning the diffusion model on $\hat{\bm{S}}$ is central: it prevents the denoiser from re-learning the low-frequency scaffold and focuses capacity on residual micro-structure, mirroring the broader principle that diffusion excels as a distributional corrector when a reasonable coarse structure is available \citep{kollovieh2023tsdiff, sikder2023transfusion}.

Let $K$ denote the number of diffusion steps, with a noise schedule $\{\beta_k\}_{k=1}^K$, $\alpha_k = 1 - \beta_k$, and $\bar{\alpha}_k = \prod_{i=1}^k \alpha_i$. The forward corruption process is:
\begin{equation}
q(\bm{r}_k \mid \bm{r}_0) = \mathcal{N}\bigl( \sqrt{\bar{\alpha}_k}\,\bm{r}_0,\; (1 - \bar{\alpha}_k)\mathbf{I} \bigr);
\label{eq:forward_corruption}
\end{equation}
equivalently,
\begin{equation}
\bm{r}_k = \sqrt{\bar{\alpha}_k}\,\bm{r}_0 + \sqrt{1 - \bar{\alpha}_k}\,\boldsymbol{\epsilon}, \quad \boldsymbol{\epsilon} \sim \mathcal{N}(\mathbf{0}, \mathbf{I}).
\label{eq:forward_corruption_eq}
\end{equation}
The learned reverse process is parameterized as:
\begin{equation}
p_{\theta}(\bm{r}_{k-1} \mid \bm{r}_k, \hat{\bm{S}}) = \mathcal{N}\bigl( \boldsymbol{\mu}_{\theta}(\bm{r}_k, k, \hat{\bm{S}}),\; \boldsymbol{\Sigma}(k) \bigr),
\label{eq:reverse_process}
\end{equation}
where $\boldsymbol{\mu}_{\theta}$ is implemented by a Transformer denoiser that consumes (i) the noised residual $\bm{r}_k$, (ii) a timestep embedding for $k$, and (iii) conditioning features derived from $\hat{\bm{S}}$. This denoiser architecture is consistent with the growing use of attention-based denoisers for long-context time-series diffusion, while our key methodological emphasis is the trend-conditioned residual factorization as the object of diffusion learning \citep{ho2020denoising,sikder2023transfusion}.
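A minimal training step for this component, under the standard $\epsilon$-prediction setup, is sketched below (PyTorch assumed; \texttt{eps\_theta} and the precomputed \texttt{alpha\_bar} schedule are assumed helpers):
\begin{verbatim}
# Sketch: one DDPM training step on residuals (PyTorch assumed).
import torch

def ddpm_step(eps_theta, r0, S_hat, alpha_bar):
    B = r0.shape[0]
    k = torch.randint(0, len(alpha_bar), (B,))    # uniform timestep
    ab = alpha_bar[k].view(B, 1, 1)               # broadcast over (L, d_c)
    eps = torch.randn_like(r0)
    r_k = ab.sqrt() * r0 + (1 - ab).sqrt() * eps  # sample q(r_k | r_0)
    pred = eps_theta(r_k, k, S_hat)               # conditional denoiser
    return torch.mean((eps - pred) ** 2)          # eps-prediction loss
\end{verbatim}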
We train the denoiser using the standard DDPM $\epsilon$-prediction objective:
\begin{equation}
\mathcal{L}_{\text{cont}}(\theta) = \mathbb{E}_{k,\bm{r}_0,\boldsymbol{\epsilon}} \left[ \bigl\| \boldsymbol{\epsilon} - \boldsymbol{\epsilon}_{\theta}(\bm{r}_k, k, \hat{\bm{S}}) \bigr\|_2^2 \right].
\label{eq:ddpm_loss}
\end{equation}
Because diffusion optimization can exhibit timestep imbalance (i.e., some timesteps dominate gradients), we optionally apply an SNR-based reweighting consistent with Min-SNR training:
\begin{equation}
\mathcal{L}^{\text{snr}}_{\text{cont}}(\theta) = \mathbb{E}_{k,\bm{r}_0,\boldsymbol{\epsilon}} \left[ w_k \bigl\| \boldsymbol{\epsilon} - \boldsymbol{\epsilon}_{\theta}(\bm{r}_k, k, \hat{\bm{S}}) \bigr\|_2^2 \right],
\label{eq:snr_loss}
\end{equation}
where $w_k = \min(\mathrm{SNR}_k, \gamma)/\mathrm{SNR}_k$, $\mathrm{SNR}_k=\bar{\alpha}_k/(1-\bar{\alpha}_k)$, and $\gamma>0$ is a cap parameter \citep{hang2023efficient}.
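The weights $w_k$ can be precomputed directly from the noise schedule; a sketch (PyTorch assumed, $\gamma$ illustrative):
\begin{verbatim}
# Sketch: Min-SNR weights w_k from the cumulative schedule alpha_bar.
import torch

def min_snr_weights(alpha_bar, gamma=5.0):
    snr = alpha_bar / (1.0 - alpha_bar)       # SNR_k
    return torch.clamp(snr, max=gamma) / snr  # w_k = min(SNR_k, gamma)/SNR_k

# usage: weight each sample's loss by the weight of its sampled timestep k
# loss = (min_snr_weights(ab)[k].view(-1, 1, 1) * (eps - pred)**2).mean()
\end{verbatim}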
After sampling $\hat{\bm{R}}$ by reverse diffusion, we reconstruct the continuous output as $\hat{\bm{X}} = \hat{\bm{S}} + \hat{\bm{R}}$. Overall, the DDPM component serves as a distributional corrector on top of a temporally coherent backbone, which is particularly suited to ICS, where low-frequency dynamics are strong and persistent but fine-scale variability (including bursts and regime-conditioned noise) remains important for realism. Relative to prior ICS diffusion efforts that primarily focus on continuous augmentation, our formulation elevates trend-conditioned residual diffusion into a modular mechanism for disentangling temporal structure from distributional refinement \citep{yuan2025ctu,sha2026ddpm}.

\subsection{Masked diffusion for discrete ICS variables}
\label{sec:method-discrete}
Discrete ICS variables must remain categorical, making Gaussian diffusion inappropriate for supervisory states and mode-like channels. While one can attempt continuous relaxations or post-hoc discretization, such strategies risk producing semantically invalid intermediate states (e.g., ``in-between'' modes) and can distort the discrete marginal distribution. Discrete-state diffusion provides a principled alternative by defining a valid corruption process directly on categorical variables \citep{austin2021structured,shi2024simplified}. In the ICS setting, this is not a secondary detail: supervisory tags often encode control-logic boundaries (modes, alarms, interlocks) that must remain within a finite vocabulary to preserve semantic correctness \citep{nist2023sp80082}.

We therefore adopt masked (absorbing) diffusion for discrete channels, where corruption replaces tokens with a special $\texttt{[MASK]}$ symbol according to a schedule \citep{shi2024simplified}. For each variable $j$, define a masking schedule $\{m_k\}_{k=1}^K$ (with $m_k\in[0,1]$) increasing in $k$. The forward corruption process is:
\begin{equation}
q(y^{(j)}_k \mid y^{(j)}_0) =
\begin{cases}
y^{(j)}_0, & \text{with probability } 1 - m_k, \\
\texttt{[MASK]}, & \text{with probability } m_k,
\end{cases}
\label{eq:masking_process}
\end{equation}
applied independently across $j$ and $t$. Let $\mathcal{M}$ denote the set of masked positions at step $k$. The denoiser $h_{\psi}$ predicts a categorical distribution over $\mathcal{V}_j$ for each masked token, conditioned on (i) the corrupted discrete sequence, (ii) the diffusion step $k$, and (iii) continuous context. Concretely, we condition on $\hat{\bm{S}}$ and $\hat{\bm{X}}$ to couple supervisory reconstruction to the underlying continuous dynamics:
\begin{equation}
p_{\psi}\bigl( y^{(j)}_0 \mid y_k, k, \hat{\bm{S}}, \hat{\bm{X}} \bigr) = h_{\psi}(y_k, k, \hat{\bm{S}}, \hat{\bm{X}}).
\label{eq:discrete_denoising}
\end{equation}
This conditioning choice is motivated by the fact that many discrete ICS states are not standalone; they are functions of regimes, thresholds, and procedural phases that manifest in continuous channels \citep{nist2023sp80082}. Training uses a categorical denoising objective:
\begin{equation}
\mathcal{L}_{\text{disc}}(\psi) = \mathbb{E}_{k} \left[ \frac{1}{|\mathcal{M}|} \sum_{(j,t) \in \mathcal{M}} \mathrm{CE}\bigl( h_{\psi}(y_k, k, \hat{\bm{S}}, \hat{\bm{X}})_{j,t},\; y^{(j)}_{0,t} \bigr) \right],
\label{eq:discrete_loss}
\end{equation}
where $\mathrm{CE}(\cdot,\cdot)$ is cross-entropy. At sampling time, we initialize all discrete tokens as $\texttt{[MASK]}$ and iteratively unmask them using the learned conditionals, ensuring that every output token lies in its legal vocabulary by construction. This discrete branch is a key differentiator of our pipeline: unlike typical continuous-only diffusion augmentation in ICS, we integrate masked diffusion as a first-class mechanism for supervisory-variable legality within the same end-to-end synthesis workflow \citep{shi2024simplified,yuan2025ctu}.
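The corruption and iterative-unmasking loops are short in code; the following sketch for a single discrete channel (PyTorch assumed; the per-step reveal fraction is a simplified stand-in for a tuned unmasking schedule) illustrates both:
\begin{verbatim}
# Sketch: absorbing corruption and iterative unmasking, one channel.
import torch

def corrupt(y0, m_k, MASK):
    drop = torch.rand(y0.shape) < m_k        # mask each position w.p. m_k
    return torch.where(drop, torch.full_like(y0, MASK), y0)

def sample_channel(h_psi, L, K, MASK, cond):
    y = torch.full((L,), MASK, dtype=torch.long)  # start fully masked
    for k in reversed(range(K)):
        logits = h_psi(y, k, *cond)               # (L, |V_j|) logits
        proposal = torch.distributions.Categorical(logits=logits).sample()
        reveal = (y == MASK) & (torch.rand(L) < 1.0 / (k + 1))
        y = torch.where(reveal, proposal, y)      # unmask a few positions
    return torch.where(y == MASK, proposal, y)    # reveal any leftovers
\end{verbatim}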
\subsection{Type-aware decomposition as factorization and routing layer}
\label{sec:method-types}
Even with a trend-conditioned residual DDPM and a discrete masked-diffusion branch, a single uniform modeling treatment can remain suboptimal, because ICS variables are generated by qualitatively different mechanisms. For example, program-driven setpoints exhibit step-and-dwell dynamics; controller outputs follow control laws conditioned on process feedback; actuator positions may show saturation and dwell; and some ``derived tags'' are deterministic functions of other channels. Treating all channels as if they were exchangeable stochastic processes can misallocate model capacity and concentrate systematic error on a small subset of mechanistically distinct variables \citep{nist2023sp80082}.

We therefore introduce a type-aware decomposition that formalizes this heterogeneity as a routing and constraint layer. Let $\tau(i)\in\{1,\dots,6\}$ assign each variable $i$ to a type class. The type assignment can be initialized from domain semantics (tag metadata, value domains, and engineering meaning), and subsequently refined via an error-attribution workflow described in the Benchmark section. Importantly, this refinement does not change the core diffusion backbone; it changes which mechanism is responsible for which variable, thereby aligning inductive bias with the variable-generating mechanism while preserving overall coherence.

We use the following taxonomy (a minimal routing sketch follows the list):
\begin{enumerate}
\item Type 1 (program-driven / setpoint-like): externally commanded, step-and-dwell variables. These can be treated as exogenous drivers (conditioning signals) or routed to specialized change-point / dwell-time models, rather than being forced into a smooth denoiser that may over-regularize step structure.

\item Type 2 (controller outputs): continuous variables tightly coupled to feedback loops; these benefit from conditional modeling where the conditioning includes relevant process variables and commanded setpoints.

\item Type 3 (actuator states/positions): often exhibit saturation, dwell, and rate limits; these may require stateful dynamics beyond generic residual diffusion, motivating either specialized conditional modules or additional inductive constraints.

\item Type 4 (process variables): inertia-dominated continuous dynamics; these are the primary beneficiaries of the Transformer trend + residual DDPM pipeline.

\item Type 5 (derived/deterministic variables): algebraic or rule-based functions of other variables; we enforce deterministic reconstruction $\hat{x}^{(i)} = g_i(\hat{\bm{X}},\hat{\bm{Y}})$ rather than learning a stochastic generator, improving logical consistency and sample efficiency.

\item Type 6 (auxiliary/low-impact variables): weakly coupled or sparse signals; we allow simplified modeling (e.g., calibrated marginals or lightweight temporal models) to avoid allocating diffusion capacity where it is not warranted.
\end{enumerate}
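In code, this layer can be as simple as a lookup from type class to mechanism (a sketch; the mechanism names and the \texttt{ROUTE} table are illustrative, and the $g_i$ for Type 5 are user-supplied deterministic functions):
\begin{verbatim}
# Sketch: type-aware routing table (type codes 1..6 as in the taxonomy).
ROUTE = {
    1: "exogenous_driver",   # program-driven / setpoint-like
    2: "conditional_model",  # controller outputs
    3: "stateful_model",     # actuator states / positions
    4: "trend_plus_ddpm",    # process variables (default pipeline)
    5: "deterministic",      # derived tags: x_i = g_i(X_hat, Y_hat)
    6: "lightweight",        # auxiliary / low-impact variables
}

def mechanism(tau_i: int) -> str:
    return ROUTE[tau_i]
\end{verbatim}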
Type-aware decomposition improves synthesis quality through three mechanisms. First, it improves capacity allocation by preventing a small set of mechanistically atypical variables from dominating gradients and distorting the learned distribution for the majority class (typically Type 4). Second, it enables constraint enforcement by deterministically reconstructing Type 5 variables, preventing logically inconsistent samples that purely learned generators can produce. Third, it improves mechanism alignment by attaching inductive biases consistent with step/dwell or saturation behaviors where generic denoisers may implicitly favor smoothness.

From a novelty standpoint, this layer is not merely an engineering ``patch''; it is an explicit methodological statement that ICS synthesis benefits from typed factorization, a principle with analogues in mixed-type generative modeling more broadly that remains underexplored in diffusion-based ICS telemetry synthesis \citep{shi2025tabdiff,yuan2025ctu,nist2023sp80082}.

\subsection{Joint optimization and end-to-end sampling}
\label{sec:method-joint}
We train the model in a staged manner consistent with the above factorization, which improves optimization stability and encourages each component to specialize in its intended role. Specifically: (i) we train the trend Transformer $f_{\phi}$ to obtain $\hat{\bm{S}}$; (ii) we compute residual targets $\bm{R} = \bm{X} - \hat{\bm{S}}$ for the continuous variables routed to residual diffusion; (iii) we train the residual DDPM $p_{\theta}(\bm{R}\mid \hat{\bm{S}})$ and the masked diffusion model $p_{\psi}(\bm{Y}\mid \text{masked}(\bm{Y}), \hat{\bm{S}}, \hat{\bm{X}})$; and (iv) we apply type-aware routing and deterministic reconstruction during sampling. This staged strategy is aligned with the design goal of separating temporal scaffolding from distributional refinement, and it mirrors the broader intuition in time-series diffusion that decoupling coarse structure and stochastic detail can mitigate ``structure vs.\ realism'' conflicts \citep{kollovieh2023tsdiff,sikder2023transfusion}.

A simple combined objective is $\mathcal{L} = \lambda\mathcal{L}_{\text{cont}} + (1-\lambda)\mathcal{L}_{\text{disc}}$, with $\lambda\in[0,1]$ controlling the balance between continuous and discrete learning. Type-aware routing determines which channels contribute to which loss and which are excluded in favor of deterministic reconstruction. In practice, this routing acts as a principled guardrail against negative transfer across variable mechanisms: channels that are best handled deterministically (Type 5) or by specialized drivers (Types 1/3, depending on configuration) are prevented from forcing the diffusion models into statistically incoherent compromises.
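The combination itself is a one-liner; routing only decides which channels feed each term (sketch):
\begin{verbatim}
# Sketch: combined objective; L_cont and L_disc are computed only over
# the channels that type-aware routing assigns to each branch.
def joint_loss(L_cont, L_disc, lam=0.5):   # lam = lambda in [0, 1]
    return lam * L_cont + (1.0 - lam) * L_disc
\end{verbatim}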
At inference time, generation follows the same structured order: (i) trend $\hat{\bm{S}}$ via the Transformer, (ii) residual $\hat{\bm{R}}$ via DDPM, (iii) discrete $\hat{\bm{Y}}$ via masked diffusion, and (iv) type-aware assembly with deterministic reconstruction for routed variables. This pipeline produces $(\hat{\bm{X}},\hat{\bm{Y}})$ that are temporally coherent by construction (through $\hat{\bm{S}}$), distributionally expressive (through $\hat{\bm{R}}$ denoising), and discretely valid (through masked diffusion), while explicitly accounting for heterogeneous variable-generating mechanisms through type-aware routing. In combination, these choices constitute our central methodological contribution: a unified Transformer + mixed-diffusion generator for ICS telemetry, augmented by typed factorization to align model capacity with domain mechanism \citep{ho2020denoising,shi2024simplified,yuan2025ctu,nist2023sp80082}.
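Putting the four stages together, end-to-end sampling reduces to the following order of operations (a sketch; \texttt{rollout}, \texttt{sample\_residual}, \texttt{sample\_discrete}, and the routing structures are assumed helpers, not our exact API):
\begin{verbatim}
# Sketch: end-to-end sampling order for one window.
def generate(rollout, sample_residual, sample_discrete, route, g):
    S_hat = rollout()                      # (i)  autoregressive trend
    R_hat = sample_residual(S_hat)         # (ii) reverse diffusion on residuals
    X_hat = S_hat + R_hat
    Y_hat = sample_discrete(S_hat, X_hat)  # (iii) iterative unmasking
    for i in route["deterministic"]:       # (iv) Type 5 reconstruction
        X_hat[:, i] = g[i](X_hat, Y_hat)
    return X_hat, Y_hat
\end{verbatim}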
% 4. Benchmark
\section{Benchmark}
\label{sec:benchmark}
In this section, we present the experimental setup and results.

% 5. Future Work
\section{Future Work}
\label{sec:future}
In this section, we outline directions for future work.

% 6. Conclusion
\section{Conclusion}
\label{sec:conclusion}
In this section, we summarize our contributions and future directions.

% References
\bibliographystyle{unsrtnat}
\bibliography{references}

\end{document}
@@ -1,421 +0,0 @@
@inproceedings{vaswani2017attention,
  title={Attention Is All You Need},
  author={Vaswani, Ashish and Shazeer, Noam and Parmar, Niki and Uszkoreit, Jakob and Jones, Llion and Gomez, Aidan N and Kaiser, {\L}ukasz and Polosukhin, Illia},
  booktitle={Advances in Neural Information Processing Systems (NeurIPS)},
  volume={30},
  year={2017},
  url={https://arxiv.org/abs/1706.03762}
}

@inproceedings{ho2020denoising,
  title={Denoising Diffusion Probabilistic Models},
  author={Ho, Jonathan and Jain, Ajay and Abbeel, Pieter},
  booktitle={Advances in Neural Information Processing Systems (NeurIPS)},
  volume={33},
  pages={6840--6851},
  year={2020},
  url={https://arxiv.org/abs/2006.11239}
}

@inproceedings{austin2021structured,
  title={Structured Denoising Diffusion Models in Discrete State-Spaces},
  author={Austin, Jacob and Johnson, Daniel D and Ho, Jonathan and Tarlow, Daniel and van den Berg, Rianne},
  booktitle={Advances in Neural Information Processing Systems (NeurIPS)},
  volume={34},
  pages={17981--17993},
  year={2021},
  url={https://arxiv.org/abs/2107.03006}
}

@article{shi2024simplified,
  title={Simplified and Generalized Masked Diffusion for Discrete Data},
  author={Shi, Jiaxin and Han, Kehang and Wang, Zhe and Doucet, Arnaud and Titsias, Michalis K},
  journal={arXiv preprint},
  eprint={2406.04329},
  archivePrefix={arXiv},
  year={2024},
  url={https://arxiv.org/abs/2406.04329}
}

@inproceedings{hang2023efficient,
  title={Efficient Diffusion Training via Min-SNR Weighting Strategy},
  author={Hang, Tiankai and Gu, Shuyang and Li, Chen and Bao, Jianmin and Chen, Dong and Hu, Han and Geng, Xin and Guo, Baining},
  booktitle={Proceedings of the IEEE/CVF International Conference on Computer Vision (ICCV)},
  pages={7407--7417},
  year={2023},
  doi={10.1109/ICCV51070.2023.00702},
  url={https://arxiv.org/abs/2303.09556}
}

@inproceedings{kollovieh2023tsdiff,
  title={Predict, Refine, Synthesize: Self-Guiding Diffusion Models for Probabilistic Time Series Forecasting},
  author={Kollovieh, Marcel and Ansari, Abdul Fatir and Bohlke-Schneider, Michael and Salinas, David},
  booktitle={Advances in Neural Information Processing Systems (NeurIPS)},
  volume={36},
  year={2023},
  url={https://arxiv.org/abs/2307.11494}
}

@article{sikder2023transfusion,
  title={TransFusion: Generating Long, High Fidelity Time Series using Diffusion Models with Transformers},
  author={Sikder, M. F. and Ramachandranpillai, R. and Heintz, F.},
  journal={arXiv preprint},
  eprint={2307.12667},
  archivePrefix={arXiv},
  year={2023},
  url={https://arxiv.org/abs/2307.12667}
}

@inproceedings{song2021score,
  title={Score-Based Generative Modeling through Stochastic Differential Equations},
  author={Song, Yang and Sohl-Dickstein, Jascha and Kingma, Diederik P and Kumar, Abhishek and Ermon, Stefano and Poole, Ben},
  booktitle={International Conference on Learning Representations (ICLR)},
  year={2021},
  url={https://arxiv.org/abs/2011.13456}
}

@inproceedings{shi2025tabdiff,
  title={TabDiff: A Mixed-type Diffusion Model for Tabular Data Generation},
  author={Shi, Juntong and Xu, Minkai and Hua, Harper and Zhang, Hengrui and Ermon, Stefano and Leskovec, Jure},
  booktitle={International Conference on Learning Representations (ICLR)},
  year={2025},
  url={https://arxiv.org/abs/2410.20626}
}

@inproceedings{yuan2025ctu,
  title={CTU-DDPM: Generating Industrial Control System Time-Series Data with a CNN-Transformer Hybrid Diffusion Model},
  author={Yuan, Yusong and Sha, Yun and Zhao, Wei and Zhang, Kun},
  booktitle={Proceedings of the 2025 International Symposium on Artificial Intelligence and Computational Social Sciences (ACM AICSS)},
  pages={123--132},
  year={2025},
  doi={10.1145/3776759.3776845},
  url={https://dl.acm.org/doi/10.1145/3776759.3776845}
}

@misc{sha2026ddpm,
  title={DDPM Fusing Mamba and Adaptive Attention: An Augmentation Method for Industrial Control Systems Anomaly Data},
  author={Sha, Yun and Yuan, Yusong and Wu, Yonghao and Zhao, Haidong},
  year={2026},
  month={jan},
  note={SSRN Electronic Journal},
  eprint={6055903},
  archivePrefix={SSRN},
  doi={10.2139/ssrn.6055903},
  url={https://papers.ssrn.com/sol3/papers.cfm?abstract_id=6055903}
}

@techreport{nist2023sp80082,
  title={Guide to Operational Technology (OT) Security},
  author={{National Institute of Standards and Technology}},
  institution={NIST},
  type={Special Publication},
  number={800-82 Rev. 3},
  year={2023},
  month={sep},
  doi={10.6028/NIST.SP.800-82r3},
  url={https://csrc.nist.gov/pubs/sp/800/82/r3/final}
}

@article{10.1145/1151659.1159928,
  author={Vishwanath, Kashi Venkatesh and Vahdat, Amin},
  title={Realistic and responsive network traffic generation},
  journal={SIGCOMM Comput. Commun. Rev.},
  year={2006},
  issue_date={October 2006},
  publisher={Association for Computing Machinery},
  address={New York, NY, USA},
  volume={36},
  number={4},
  month=aug,
  pages={111--122},
  numpages={12},
  issn={0146-4833},
  doi={10.1145/1151659.1159928},
  url={https://doi.org/10.1145/1151659.1159928},
  keywords={burstiness, energy plot, generator, internet, modeling, structural model, traffic, wavelets}
}

@inproceedings{10.1145/1159913.1159928,
  author={Vishwanath, Kashi Venkatesh and Vahdat, Amin},
  title={Realistic and responsive network traffic generation},
  year={2006},
  isbn={1595933085},
  publisher={Association for Computing Machinery},
  address={New York, NY, USA},
  doi={10.1145/1159913.1159928},
  url={https://doi.org/10.1145/1159913.1159928},
  booktitle={Proceedings of the 2006 Conference on Applications, Technologies, Architectures, and Protocols for Computer Communications}
|
|
||||||
pages = {111–122},
|
|
||||||
numpages = {12},
|
|
||||||
keywords = {burstiness, energy plot, generator, internet, modeling, structural model, traffic, wavelets},
|
|
||||||
location = {Pisa, Italy},
|
|
||||||
series = {SIGCOMM '06}
|
|
||||||
}
|
|
||||||
|
|
||||||
@article{Ring_2019,
|
|
||||||
title={Flow-based network traffic generation using Generative Adversarial Networks},
|
|
||||||
volume={82},
|
|
||||||
ISSN={0167-4048},
|
|
||||||
url={http://dx.doi.org/10.1016/j.cose.2018.12.012},
|
|
||||||
DOI={10.1016/j.cose.2018.12.012},
|
|
||||||
journal={Computers & Security},
|
|
||||||
publisher={Elsevier BV},
|
|
||||||
author={Ring, Markus and Schlör, Daniel and Landes, Dieter and Hotho, Andreas},
|
|
||||||
year={2019},
|
|
||||||
month=may, pages={156–172} }
|
|
||||||
|
|
||||||
@inproceedings{10.1145/3544216.3544251,
|
|
||||||
author = {Yin, Yucheng and Lin, Zinan and Jin, Minhao and Fanti, Giulia and Sekar, Vyas},
|
|
||||||
title = {Practical GAN-based synthetic IP header trace generation using NetShare},
|
|
||||||
year = {2022},
|
|
||||||
isbn = {9781450394208},
|
|
||||||
publisher = {Association for Computing Machinery},
|
|
||||||
address = {New York, NY, USA},
|
|
||||||
url = {https://doi.org/10.1145/3544216.3544251},
|
|
||||||
doi = {10.1145/3544216.3544251},
|
|
||||||
abstract = {We explore the feasibility of using Generative Adversarial Networks (GANs) to automatically learn generative models to generate synthetic packet- and flow header traces for networking tasks (e.g., telemetry, anomaly detection, provisioning). We identify key fidelity, scalability, and privacy challenges and tradeoffs in existing GAN-based approaches. By synthesizing domain-specific insights with recent advances in machine learning and privacy, we identify design choices to tackle these challenges. Building on these insights, we develop an end-to-end framework, NetShare. We evaluate NetShare on six diverse packet header traces and find that: (1) across all distributional metrics and traces, it achieves 46\% more accuracy than baselines and (2) it meets users' requirements of downstream tasks in evaluating accuracy and rank ordering of candidate approaches.},
|
|
||||||
booktitle = {Proceedings of the ACM SIGCOMM 2022 Conference},
|
|
||||||
pages = {458–472},
|
|
||||||
numpages = {15},
|
|
||||||
keywords = {synthetic data generation, privacy, network packets, network flows, generative adversarial networks},
|
|
||||||
location = {Amsterdam, Netherlands},
|
|
||||||
series = {SIGCOMM '22}
|
|
||||||
}
|
|
||||||
|
|
||||||
@inproceedings{Lin_2020, series={IMC ’20},
|
|
||||||
title={Using GANs for Sharing Networked Time Series Data: Challenges, Initial Promise, and Open Questions},
|
|
||||||
url={http://dx.doi.org/10.1145/3419394.3423643},
|
|
||||||
DOI={10.1145/3419394.3423643},
|
|
||||||
booktitle={Proceedings of the ACM Internet Measurement Conference},
|
|
||||||
publisher={ACM},
|
|
||||||
author={Lin, Zinan and Jain, Alankar and Wang, Chen and Fanti, Giulia and Sekar, Vyas},
|
|
||||||
year={2020},
|
|
||||||
month=oct, pages={464–483},
|
|
||||||
collection={IMC ’20} }
|
|
||||||
|
|
||||||
@INPROCEEDINGS{7469060,
|
|
||||||
author={Mathur, Aditya P. and Tippenhauer, Nils Ole},
|
|
||||||
booktitle={2016 International Workshop on Cyber-physical Systems for Smart Water Networks (CySWater)},
|
|
||||||
title={SWaT: a water treatment testbed for research and training on ICS security},
|
|
||||||
year={2016},
|
|
||||||
volume={},
|
|
||||||
number={},
|
|
||||||
pages={31-36},
|
|
||||||
keywords={Sensors;Actuators;Feeds;Process control;Chemicals;Chemical sensors;Security;Cyber Physical Systems;Industrial Control Systems;Cyber Attacks;Cyber Defense;Water Testbed},
|
|
||||||
doi={10.1109/CySWater.2016.7469060}}
|
|
||||||
|
|
||||||
@inproceedings{10.1145/3055366.3055375,
|
|
||||||
author = {Ahmed, Chuadhry Mujeeb and Palleti, Venkata Reddy and Mathur, Aditya P.},
|
|
||||||
title = {WADI: a water distribution testbed for research in the design of secure cyber physical systems},
|
|
||||||
year = {2017},
|
|
||||||
isbn = {9781450349758},
|
|
||||||
publisher = {Association for Computing Machinery},
|
|
||||||
address = {New York, NY, USA},
|
|
||||||
url = {https://doi.org/10.1145/3055366.3055375},
|
|
||||||
doi = {10.1145/3055366.3055375},
|
|
||||||
abstract = {The architecture of a water distribution testbed (WADI), and on-going research in the design of secure water distribution system is presented. WADI consists of three stages controlled by Programmable Logic Controllers (PLCs) and two stages controlled via Remote Terminal Units (RTUs). Each PLC and RTU uses sensors to estimate the system state and the actuators to effect control. WADI is currently used to (a) conduct security analysis for water distribution networks, (b) experimentally assess detection mechanisms for potential cyber and physical attacks, and (c) understand how the impact of an attack on one CPS could cascade to other connected CPSs. The cascading effects of attacks can be studied in WADI through its connection to two other testbeds, namely for water treatment and power generation and distribution.},
|
|
||||||
booktitle = {Proceedings of the 3rd International Workshop on Cyber-Physical Systems for Smart Water Networks},
|
|
||||||
pages = {25–28},
|
|
||||||
numpages = {4},
|
|
||||||
keywords = {attack detection, cyber physical systems, cyber security, industrial control systems, water distribution testbed},
|
|
||||||
location = {Pittsburgh, Pennsylvania},
|
|
||||||
series = {CySWATER '17}
|
|
||||||
}
|
|
||||||
|
|
||||||
@inproceedings{NEURIPS2020_4c5bcfec,
|
|
||||||
author = {Ho, Jonathan and Jain, Ajay and Abbeel, Pieter},
|
|
||||||
booktitle = {Advances in Neural Information Processing Systems},
|
|
||||||
editor = {H. Larochelle and M. Ranzato and R. Hadsell and M.F. Balcan and H. Lin},
|
|
||||||
pages = {6840--6851},
|
|
||||||
publisher = {Curran Associates, Inc.},
|
|
||||||
title = {Denoising Diffusion Probabilistic Models},
|
|
||||||
url = {https://proceedings.neurips.cc/paper_files/paper/2020/file/4c5bcfec8584af0d967f1ab10179ca4b-Paper.pdf},
|
|
||||||
volume = {33},
|
|
||||||
year = {2020}
|
|
||||||
}
|
|
||||||
|
|
||||||
@misc{song2021scorebasedgenerativemodelingstochastic,
|
|
||||||
title={Score-Based Generative Modeling through Stochastic Differential Equations},
|
|
||||||
author={Yang Song and Jascha Sohl-Dickstein and Diederik P. Kingma and Abhishek Kumar and Stefano Ermon and Ben Poole},
|
|
||||||
year={2021},
|
|
||||||
eprint={2011.13456},
|
|
||||||
archivePrefix={arXiv},
|
|
||||||
primaryClass={cs.LG},
|
|
||||||
url={https://arxiv.org/abs/2011.13456},
|
|
||||||
}
|
|
||||||
|
|
||||||
@misc{rasul2021autoregressivedenoisingdiffusionmodels,
|
|
||||||
title={Autoregressive Denoising Diffusion Models for Multivariate Probabilistic Time Series Forecasting},
|
|
||||||
author={Kashif Rasul and Calvin Seward and Ingmar Schuster and Roland Vollgraf},
|
|
||||||
year={2021},
|
|
||||||
eprint={2101.12072},
|
|
||||||
archivePrefix={arXiv},
|
|
||||||
primaryClass={cs.LG},
|
|
||||||
url={https://arxiv.org/abs/2101.12072},
|
|
||||||
}
|
|
||||||
|
|
||||||
@misc{tashiro2021csdiconditionalscorebaseddiffusion,
|
|
||||||
title={CSDI Conditional Score-based Diffusion Models for Probabilistic Time Series Imputation},
|
|
||||||
author={Yusuke Tashiro and Jiaming Song and Yang Song and Stefano Ermon},
|
|
||||||
year={2021},
|
|
||||||
eprint={2107.03502},
|
|
||||||
archivePrefix={arXiv},
|
|
||||||
primaryClass={cs.LG},
|
|
||||||
url={httpsarxiv.orgabs2107.03502},
|
|
||||||
}
|
|
||||||
|
|
||||||
@misc{wen2024diffstgprobabilisticspatiotemporalgraph,
|
|
||||||
title={DiffSTG: Probabilistic Spatio-Temporal Graph Forecasting with Denoising Diffusion Models},
|
|
||||||
author={Haomin Wen and Youfang Lin and Yutong Xia and Huaiyu Wan and Qingsong Wen and Roger Zimmermann and Yuxuan Liang},
|
|
||||||
year={2024},
|
|
||||||
eprint={2301.13629},
|
|
||||||
archivePrefix={arXiv},
|
|
||||||
primaryClass={cs.LG},
|
|
||||||
url={https://arxiv.org/abs/2301.13629},
|
|
||||||
}
|
|
||||||
|
|
||||||
@misc{liu2023pristiconditionaldiffusionframework,
|
|
||||||
title={PriSTI: A Conditional Diffusion Framework for Spatiotemporal Imputation},
|
|
||||||
author={Mingzhe Liu and Han Huang and Hao Feng and Leilei Sun and Bowen Du and Yanjie Fu},
|
|
||||||
year={2023},
|
|
||||||
eprint={2302.09746},
|
|
||||||
archivePrefix={arXiv},
|
|
||||||
primaryClass={cs.LG},
|
|
||||||
url={https://arxiv.org/abs/2302.09746},
|
|
||||||
}
|
|
||||||
|
|
||||||
@misc{kong2021diffwaveversatilediffusionmodel,
|
|
||||||
title={DiffWave: A Versatile Diffusion Model for Audio Synthesis},
|
|
||||||
author={Zhifeng Kong and Wei Ping and Jiaji Huang and Kexin Zhao and Bryan Catanzaro},
|
|
||||||
year={2021},
|
|
||||||
eprint={2009.09761},
|
|
||||||
archivePrefix={arXiv},
|
|
||||||
primaryClass={eess.AS},
|
|
||||||
url={https://arxiv.org/abs/2009.09761},
|
|
||||||
}
|
|
||||||
|
|
||||||
@ARTICLE{11087622,
|
|
||||||
author={Liu, Xiaosi and Xu, Xiaowen and Liu, Zhidan and Li, Zhenjiang and Wu, Kaishun},
|
|
||||||
journal={IEEE Transactions on Mobile Computing},
|
|
||||||
title={Spatio-Temporal Diffusion Model for Cellular Traffic Generation},
|
|
||||||
year={2026},
|
|
||||||
volume={25},
|
|
||||||
number={1},
|
|
||||||
pages={257-271},
|
|
||||||
keywords={Base stations;Diffusion models;Data models;Uncertainty;Predictive models;Generative adversarial networks;Knowledge graphs;Mobile computing;Telecommunication traffic;Semantics;Cellular traffic;data generation;diffusion model;spatio-temporal graph},
|
|
||||||
doi={10.1109/TMC.2025.3591183}}
|
|
||||||
|
|
||||||
@misc{austin2023structureddenoisingdiffusionmodels,
|
|
||||||
title={Structured Denoising Diffusion Models in Discrete State-Spaces},
|
|
||||||
author={Jacob Austin and Daniel D. Johnson and Jonathan Ho and Daniel Tarlow and Rianne van den Berg},
|
|
||||||
year={2023},
|
|
||||||
eprint={2107.03006},
|
|
||||||
archivePrefix={arXiv},
|
|
||||||
primaryClass={cs.LG},
|
|
||||||
url={https://arxiv.org/abs/2107.03006},
|
|
||||||
}
|
|
||||||
|
|
||||||
@misc{li2022diffusionlmimprovescontrollabletext,
|
|
||||||
title={Diffusion-LM Improves Controllable Text Generation},
|
|
||||||
author={Xiang Lisa Li and John Thickstun and Ishaan Gulrajani and Percy Liang and Tatsunori B. Hashimoto},
|
|
||||||
year={2022},
|
|
||||||
eprint={2205.14217},
|
|
||||||
archivePrefix={arXiv},
|
|
||||||
primaryClass={cs.CL},
|
|
||||||
url={httpsarxiv.orgabs2205.14217},
|
|
||||||
}
|
|
||||||
|
|
||||||
@misc{meng2025aflnetyearslatercoverageguided,
|
|
||||||
title={AFLNet Five Years Later: On Coverage-Guided Protocol Fuzzing},
|
|
||||||
author={Ruijie Meng and Van-Thuan Pham and Marcel Böhme and Abhik Roychoudhury},
|
|
||||||
year={2025},
|
|
||||||
eprint={2412.20324},
|
|
||||||
archivePrefix={arXiv},
|
|
||||||
primaryClass={cs.SE},
|
|
||||||
url={https://arxiv.org/abs/2412.20324},
|
|
||||||
}
|
|
||||||
|
|
||||||
@misc{godefroid2017learnfuzzmachinelearninginput,
|
|
||||||
title={Learn&Fuzz: Machine Learning for Input Fuzzing},
|
|
||||||
author={Patrice Godefroid and Hila Peleg and Rishabh Singh},
|
|
||||||
year={2017},
|
|
||||||
eprint={1701.07232},
|
|
||||||
archivePrefix={arXiv},
|
|
||||||
primaryClass={cs.AI},
|
|
||||||
url={https://arxiv.org/abs/1701.07232},
|
|
||||||
}
|
|
||||||
|
|
||||||
@misc{she2019neuzzefficientfuzzingneural,
|
|
||||||
title={NEUZZ: Efficient Fuzzing with Neural Program Smoothing},
|
|
||||||
author={Dongdong She and Kexin Pei and Dave Epstein and Junfeng Yang and Baishakhi Ray and Suman Jana},
|
|
||||||
year={2019},
|
|
||||||
eprint={1807.05620},
|
|
||||||
archivePrefix={arXiv},
|
|
||||||
primaryClass={cs.CR},
|
|
||||||
url={https://arxiv.org/abs/1807.05620},
|
|
||||||
}
|
|
||||||
|
|
||||||
@misc{hoogeboom2021argmaxflowsmultinomialdiffusion,
|
|
||||||
title={Argmax Flows and Multinomial Diffusion: Learning Categorical Distributions},
|
|
||||||
author={Emiel Hoogeboom and Didrik Nielsen and Priyank Jaini and Patrick Forré and Max Welling},
|
|
||||||
year={2021},
|
|
||||||
eprint={2102.05379},
|
|
||||||
archivePrefix={arXiv},
|
|
||||||
primaryClass={stat.ML},
|
|
||||||
url={https://arxiv.org/abs/2102.05379},
|
|
||||||
}
|
|
||||||
|
|
||||||
@misc{dai2019transformerxlattentivelanguagemodels,
|
|
||||||
title={Transformer-XL: Attentive Language Models Beyond a Fixed-Length Context},
|
|
||||||
author={Zihang Dai and Zhilin Yang and Yiming Yang and Jaime Carbonell and Quoc V. Le and Ruslan Salakhutdinov},
|
|
||||||
year={2019},
|
|
||||||
eprint={1901.02860},
|
|
||||||
archivePrefix={arXiv},
|
|
||||||
primaryClass={cs.LG},
|
|
||||||
url={https://arxiv.org/abs/1901.02860},
|
|
||||||
}
|
|
||||||
|
|
||||||
@misc{zhou2021informerefficienttransformerlong,
|
|
||||||
title={Informer: Beyond Efficient Transformer for Long Sequence Time-Series Forecasting},
|
|
||||||
author={Haoyi Zhou and Shanghang Zhang and Jieqi Peng and Shuai Zhang and Jianxin Li and Hui Xiong and Wancai Zhang},
|
|
||||||
year={2021},
|
|
||||||
eprint={2012.07436},
|
|
||||||
archivePrefix={arXiv},
|
|
||||||
primaryClass={cs.LG},
|
|
||||||
url={https://arxiv.org/abs/2012.07436},
|
|
||||||
}
|
|
||||||
|
|
||||||
@misc{wu2022autoformerdecompositiontransformersautocorrelation,
|
|
||||||
title={Autoformer: Decomposition Transformers with Auto-Correlation for Long-Term Series Forecasting},
|
|
||||||
author={Haixu Wu and Jiehui Xu and Jianmin Wang and Mingsheng Long},
|
|
||||||
year={2022},
|
|
||||||
eprint={2106.13008},
|
|
||||||
archivePrefix={arXiv},
|
|
||||||
primaryClass={cs.LG},
|
|
||||||
url={https://arxiv.org/abs/2106.13008},
|
|
||||||
}
|
|
||||||
|
|
||||||
@misc{zhou2022fedformerfrequencyenhanceddecomposed,
|
|
||||||
title={FEDformer: Frequency Enhanced Decomposed Transformer for Long-term Series Forecasting},
|
|
||||||
author={Tian Zhou and Ziqing Ma and Qingsong Wen and Xue Wang and Liang Sun and Rong Jin},
|
|
||||||
year={2022},
|
|
||||||
eprint={2201.12740},
|
|
||||||
archivePrefix={arXiv},
|
|
||||||
primaryClass={cs.LG},
|
|
||||||
url={https://arxiv.org/abs/2201.12740},
|
|
||||||
}
|
|
||||||
|
|
||||||
@article{2023,
|
|
||||||
title={A Note on Extremal Sombor Indices of Trees with a Given Degree Sequence},
|
|
||||||
volume={90},
|
|
||||||
ISSN={0340-6253},
|
|
||||||
url={http://dx.doi.org/10.46793/match.90-1.197D},
|
|
||||||
DOI={10.46793/match.90-1.197d},
|
|
||||||
number={1},
|
|
||||||
journal={Match Communications in Mathematical and in Computer Chemistry},
|
|
||||||
publisher={University Library in Kragujevac},
|
|
||||||
author={Damjanović, Ivan and Milošević, Marko and Stevanović, Dragan},
|
|
||||||
year={2023},
|
|
||||||
pages={197–202} }
|
|
||||||
@@ -1,214 +0,0 @@
\documentclass{article}

\usepackage{arxiv}

\usepackage[utf8]{inputenc} % allow utf-8 input
\usepackage[T1]{fontenc}    % use 8-bit T1 fonts
\usepackage{hyperref}       % hyperlinks
\usepackage{url}            % simple URL typesetting
\usepackage{booktabs}       % professional-quality tables
\usepackage{amsfonts}       % blackboard math symbols
\usepackage{nicefrac}       % compact symbols for 1/2, etc.
\usepackage{microtype}      % microtypography
\usepackage{cleveref}       % smart cross-referencing
\usepackage{lipsum}         % Can be removed after putting your text content
\usepackage{graphicx}
\usepackage{natbib}
\usepackage{doi}

\title{A template for the \emph{arxiv} style}

% Here you can change the date presented in the paper title
%\date{September 9, 1985}
% Or remove it
%\date{}

\newif\ifuniqueAffiliation
% Comment to use multiple affiliations variant of author block
\uniqueAffiliationtrue

\ifuniqueAffiliation % Standard variant of author block
\author{ \href{https://orcid.org/0000-0000-0000-0000}{\includegraphics[scale=0.06]{orcid.pdf}\hspace{1mm}David S.~Hippocampus}\thanks{Use footnote for providing further
		information about author (webpage, alternative
		address)---\emph{not} for acknowledging funding agencies.} \\
	Department of Computer Science\\
	Cranberry-Lemon University\\
	Pittsburgh, PA 15213 \\
	\texttt{hippo@cs.cranberry-lemon.edu} \\
	%% examples of more authors
	\And
	\href{https://orcid.org/0000-0000-0000-0000}{\includegraphics[scale=0.06]{orcid.pdf}\hspace{1mm}Elias D.~Striatum} \\
	Department of Electrical Engineering\\
	Mount-Sheikh University\\
	Santa Narimana, Levand \\
	\texttt{stariate@ee.mount-sheikh.edu} \\
	%% \AND
	%% Coauthor \\
	%% Affiliation \\
	%% Address \\
	%% \texttt{email} \\
	%% \And
	%% Coauthor \\
	%% Affiliation \\
	%% Address \\
	%% \texttt{email} \\
	%% \And
	%% Coauthor \\
	%% Affiliation \\
	%% Address \\
	%% \texttt{email} \\
}
\else
% Multiple affiliations variant of author block
\usepackage{authblk}
\renewcommand\Authfont{\bfseries}
\setlength{\affilsep}{0em}
% box is needed for correct spacing with authblk
\newbox{\orcid}\sbox{\orcid}{\includegraphics[scale=0.06]{orcid.pdf}}
\author[1]{%
	\href{https://orcid.org/0000-0000-0000-0000}{\usebox{\orcid}\hspace{1mm}David S.~Hippocampus\thanks{\texttt{hippo@cs.cranberry-lemon.edu}}}%
}
\author[1,2]{%
	\href{https://orcid.org/0000-0000-0000-0000}{\usebox{\orcid}\hspace{1mm}Elias D.~Striatum\thanks{\texttt{stariate@ee.mount-sheikh.edu}}}%
}
\affil[1]{Department of Computer Science, Cranberry-Lemon University, Pittsburgh, PA 15213}
\affil[2]{Department of Electrical Engineering, Mount-Sheikh University, Santa Narimana, Levand}
\fi

% Uncomment to override the `A preprint' in the header
%\renewcommand{\headeright}{Technical Report}
%\renewcommand{\undertitle}{Technical Report}
\renewcommand{\shorttitle}{\textit{arXiv} Template}

%%% Add PDF metadata to help others organize their library
%%% Once the PDF is generated, you can check the metadata with
%%% $ pdfinfo template.pdf
\hypersetup{
pdftitle={A template for the arxiv style},
pdfsubject={q-bio.NC, q-bio.QM},
pdfauthor={David S.~Hippocampus, Elias D.~Striatum},
pdfkeywords={First keyword, Second keyword, More},
}

\begin{document}
\maketitle

\begin{abstract}
	\lipsum[1]
\end{abstract}

% keywords can be removed
\keywords{First keyword \and Second keyword \and More}

\section{Introduction}
\lipsum[2]
\lipsum[3]

\section{Headings: first level}
\label{sec:headings}

\lipsum[4] See Section \ref{sec:headings}.

\subsection{Headings: second level}
\lipsum[5]
\begin{equation}
	\xi _{ij}(t)=P(x_{t}=i,x_{t+1}=j|y,v,w;\theta)= {\frac {\alpha _{i}(t)a^{w_t}_{ij}\beta _{j}(t+1)b^{v_{t+1}}_{j}(y_{t+1})}{\sum _{i=1}^{N} \sum _{j=1}^{N} \alpha _{i}(t)a^{w_t}_{ij}\beta _{j}(t+1)b^{v_{t+1}}_{j}(y_{t+1})}}
\end{equation}

\subsubsection{Headings: third level}
\lipsum[6]

\paragraph{Paragraph}
\lipsum[7]

\section{Examples of citations, figures, tables, references}
\label{sec:others}

\subsection{Citations}
Citations use \verb+natbib+. The documentation may be found at
\begin{center}
	\url{http://mirrors.ctan.org/macros/latex/contrib/natbib/natnotes.pdf}
\end{center}

Here is an example usage of the two main commands (\verb+citet+ and \verb+citep+): Some people thought a thing \citep{kour2014real, keshet2016prediction} but other people thought something else \citep{kour2014fast}. Many people have speculated that if we knew exactly why \citet{kour2014fast} thought this\dots

\subsection{Figures}
\lipsum[10]
See Figure \ref{fig:fig1}. Here is how you add footnotes. \footnote{Sample of the first footnote.}
\lipsum[11]

\begin{figure}
	\centering
	\fbox{\rule[-.5cm]{4cm}{4cm} \rule[-.5cm]{4cm}{0cm}}
	\caption{Sample figure caption.}
	\label{fig:fig1}
\end{figure}

\subsection{Tables}
See awesome Table~\ref{tab:table}.

The documentation for \verb+booktabs+ (`Publication quality tables in LaTeX') is available from:
\begin{center}
	\url{https://www.ctan.org/pkg/booktabs}
\end{center}

\begin{table}
	\caption{Sample table title}
	\centering
	\begin{tabular}{lll}
		\toprule
		\multicolumn{2}{c}{Part}                   \\
		\cmidrule(r){1-2}
		Name     & Description     & Size ($\mu$m) \\
		\midrule
		Dendrite & Input terminal  & $\sim$100     \\
		Axon     & Output terminal & $\sim$10      \\
		Soma     & Cell body       & up to $10^6$  \\
		\bottomrule
	\end{tabular}
	\label{tab:table}
\end{table}

\subsection{Lists}
\begin{itemize}
	\item Lorem ipsum dolor sit amet
	\item consectetur adipiscing elit.
	\item Aliquam dignissim blandit est, in dictum tortor gravida eget. In ac rutrum magna.
\end{itemize}

\bibliographystyle{unsrtnat}
\bibliography{references} %%% Uncomment this line and comment out the ``thebibliography'' section below to use the external .bib file (using bibtex).

%%% Uncomment this section and comment out the \bibliography{references} line above to use inline references.
% \begin{thebibliography}{1}

% \bibitem{kour2014real}
% George Kour and Raid Saabne.
% \newblock Real-time segmentation of on-line handwritten arabic script.
% \newblock In {\em Frontiers in Handwriting Recognition (ICFHR), 2014 14th
%   International Conference on}, pages 417--422. IEEE, 2014.

% \bibitem{kour2014fast}
% George Kour and Raid Saabne.
% \newblock Fast classification of handwritten on-line arabic characters.
% \newblock In {\em Soft Computing and Pattern Recognition (SoCPaR), 2014 6th
%   International Conference of}, pages 312--318. IEEE, 2014.

% \bibitem{keshet2016prediction}
% Keshet, Renato, Alina Maor, and George Kour.
% \newblock Prediction-Based, Prioritized Market-Share Insight Extraction.
% \newblock In {\em Advanced Data Mining and Applications (ADMA), 2016 12th International
%   Conference of}, pages 81--94, 2016.

% \end{thebibliography}

\end{document}
@@ -0,0 +1,56 @@
# A Modbus Traffic Generator for Evaluating the Security of SCADA Systems

**Question 1**: Summarize the paper's content, covering the research background and problem, the objective, the methods, the main results, and the conclusions, in 150-300 words and using the paper's own terms and concepts.

Summary: As SCADA systems have gained Internet connectivity they have become exposed to a large volume of cyber attacks, yet existing research lacks automated malicious-traffic generation tools for evaluating the effectiveness of security solutions. This paper presents a malicious traffic generator for Modbus/TCP whose goal is to automatically extract features from Snort NIDS rules and use Scapy to generate the corresponding Modbus packets, so that security solutions can be evaluated in a test environment. The method parses Snort rule headers and options (content, offset); constructs and modifies the MBAP and Modbus PDU/ADU headers and payloads; encapsulates the result into TCP/IP packets; and establishes a session to send them. The paper provides detailed algorithms and an experimental testbed (sender, receiver, Snort NIDS, and a mirror port). The main results show that the tool successfully generates malicious traffic that triggers the specified Snort rules: Wireshark verified the field values (e.g., byte 9 equal to 0x09), and the Snort log entries matched the rules' sid/msg. Conclusion: the tool provides a reproducible source of experimental traffic for SCADA security evaluation and can be extended to other protocols (e.g., DNP3).

**Question 2**: Extract the paper's abstract verbatim; the abstract normally appears after "Abstract" and before the Introduction.

Supervisory control and data acquisition (SCADA) systems are used to monitor and control several industrial functions such as: oil & gas, electricity, water, nuclear fusion, etc. Recently, the Internet connectivity to SCADA systems introduced new vulnerabilities to these systems and made it a target for immense amount of attacks. In the literature, several solutions have been developed to secure SCADA systems; however, the literature is lacking work directed at the development of tools to evaluate the effectiveness of such solutions. An essential requirement of such tools is the generation of normal and malicious SCADA traffic. In this paper, we present an automated tool to generate a malicious SCADA traffic to be used to evaluate such systems. We consider the traffic generation of the popular SCADA Modbus protocol. The characteristics of the generated traffic are derived from Snort network intrusion detection system (NIDS) Modbus rules. The tool uses Scapy to generate packets based on the extracted traffic features. We present the testing results for our tool. The tool is used to read a Snort rule file that contains Modbus rules to extract the required traffic features.

**Question 3**: List all of the paper's authors in the format `Author 1, Author 2, Author 3`.

Rami Al-Dalky, Omar Abduljaleel, Khaled Salah, Hadi Otrok, Mahmoud Al-Qutayri

**Question 4**: State directly which conference or journal published this paper, with no reasoning or extra information.

2014 9th International Symposium on Communication Systems, Networks & Digital Signal Processing (CSNDSP)

**Question 5**: Describe in detail the core problem the paper addresses, then summarize it concisely.

Core problem: there is no automated tool that generates SCADA malicious traffic from standard detection rules, so the practical effectiveness of protection and detection solutions cannot be properly evaluated. In short: automatically extract features from Snort's Modbus rules, use Scapy to generate malicious Modbus/TCP packets that trigger those rules, and validate security solutions in a real or emulated experimental environment.

**Question 6**: List the methods proposed in the paper, summarizing the core idea of each as concisely as possible.

- Snort-rule feature extraction: parse the rule header (protocol, ports) and options (content, offset), filtering for Modbus-related rules (TCP/502).
- Header/payload mapping algorithm: assign each content to the MBAP header, the Modbus header, or the payload according to its offset, writing across boundaries and padding the payload where necessary (see the sketch after this list).
- Encapsulation and sending workflow: build the Modbus PDU/ADU, encapsulate it into TCP/IP (port 502), then establish the session, send, acknowledge, and close.
- Testbed design: a sender generates the traffic, a receiver listens on port 502, a mirror port feeds Snort for capture, and Wireshark verifies the fields.
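
A minimal sketch of the rule-to-packet idea in Scapy (the library the paper uses). This is not the authors' released tool: the `rule` dict, the `build_modbus_adu` helper, the destination address, and the default function code 0x08 are illustrative assumptions. It overlays a Snort `content` byte string at its `offset` inside a default Modbus/TCP ADU and sends it toward port 502.

```python
# Sketch only: place a parsed Snort (content, offset) pair into a default
# Modbus/TCP ADU and emit it with Scapy. Requires root privileges to send.
import struct
from scapy.all import IP, TCP, Raw, send

def build_modbus_adu(content: bytes, offset: int) -> bytes:
    pdu = bytearray([0x08]) + bytearray(10)          # default function code + zeroed payload
    mbap = bytearray(struct.pack(">HHHB", 1, 0, len(pdu) + 1, 1))  # txn id, proto id, length, unit id
    adu = mbap + pdu
    end = offset + len(content)
    if end > len(adu):                               # pad if the rule writes past the default ADU
        adu += bytearray(end - len(adu))
    adu[offset:end] = content                        # overlay the rule's content at its offset
    return bytes(adu)

rule = {"content": b"\x09", "offset": 8}             # e.g., force byte 9 (0-indexed offset 8) to 0x09
payload = build_modbus_adu(rule["content"], rule["offset"])
pkt = IP(dst="192.0.2.10") / TCP(dport=502, sport=40000, flags="PA") / Raw(payload)
send(pkt, verbose=False)                             # the paper's tool first completes a TCP handshake
```

In the paper's workflow a full TCP session is established before the crafted ADU is sent, so the NIDS on the mirror port observes a complete, well-formed conversation.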

**Question 7**: State the dataset(s) the paper uses, including names and sources.

The study uses no public dataset. The input source is a Snort NIDS rule file containing Modbus rules, from which the tool extracts traffic features and generates malicious Modbus/TCP packets; the experimental data come from testbed packet captures and Snort alert logs (Wireshark and Snort output).

**Question 8**: List all of the metrics used to evaluate the method, briefly explaining the role of each.

- Rule triggering (alert entries, sid/msg matching): verifies that the generated traffic triggers the target Snort rules, i.e., effectiveness.
- Packet field correctness (Wireshark dissection, key byte values such as byte 9 = 0x09): confirms that the MBAP/Modbus headers and payload are generated exactly as the rule specifies.
- Session and delivery success (TCP session establishment/teardown, listening on port 502): ensures the traffic reaches the target and is captured by the NIDS via the mirror port.
- Rule coverage (number of input rules vs. number triggered): measures how completely the tool supports the rule set.

**Question 9**: Summarize the experimental results, including concrete numbers and the conclusions drawn.

The experiment uses 3 Modbus Snort rules as input, and the tool generates malicious Modbus/TCP packets for each rule in turn. On the receiver, Wireshark shows the default MBAP header and verifies key values such as byte 9 being 0x09. Snort captures the traffic through a network mirror port and raises 3 corresponding alerts whose sid and msg exactly match the input rules. The testbed consists of three workstations on 1 Gbps links, listening on port 502. Conclusion: the tool reliably and accurately generates malicious Modbus traffic that triggers the specified rules, satisfying SCADA security evaluation's need for controllable malicious traffic.

**Question 10**: Clearly describe the work done in the paper, listing the motivation, the contributions, and the main innovations.

- Motivation: SCADA security solutions urgently need to be evaluated in controlled environments with realistic protocol-level malicious traffic, but existing research lacks an automated traffic generation tool.
- Contributions:
  1. Proposes and implements a Snort-rule-driven Modbus malicious traffic generator (released as open source).
  2. Designs the rule-to-packet-field mapping and encapsulation algorithms (MBAP/Modbus PDU/ADU into TCP/IP).
  3. Builds an evaluation testbed and validates the tool's effectiveness and consistency in triggering the target rules.
- Main innovations:
  - First to turn Snort NIDS rules directly into an executable Modbus/TCP malicious-packet generation pipeline, automatically closing the loop from detection signatures to attack traffic.
  - Refines the automatic mapping of offset/content to protocol-layer fields, handling writes that cross header/payload boundaries so that the semantics of the generated packets precisely match the trigger conditions.
@@ -0,0 +1,55 @@
# Benchmarking of synthetic network data: Reviewing challenges and approaches

**Question 1**: Summarize the paper's content, covering the research background and problem, the objective, the methods, the main results, and the conclusions, in 150-300 words and using the paper's own terms and concepts.

The paper targets the lack of standardization in assessing synthetic-data quality in the NetFlow domain: NIDS training and evaluation need labeled traffic, but the quality of synthetic data produced by generative models is hard to compare under a unified standard. Through a literature review the authors collect evaluation dimensions, select and organize a NetFlow-oriented metric set, aggregate it into a Data Dissimilarity Score and a Domain Dissimilarity Score, and verify on real NetFlow benchmark data that these metrics separate same-source from different-source data distributions. A case study with WGAN- and GPT-2-generated data then uses intra-/inter-dataset similarity on real data to establish upper/lower bounds and baselines, yielding an objective, model-agnostic benchmark framework for comparing generators and tracking data-quality changes during training.

**Question 2**: Extract the paper's abstract verbatim; the abstract normally appears after "Abstract" and before the Introduction.

Datasets of labeled network traces are essential for a multitude of machine learning (ML) tasks in networking, yet their availability is hindered by privacy and maintenance concerns, such as data staleness. To overcome this limitation, synthetic network traces can often augment existing datasets. Unfortunately, current synthetic trace generation methods, which typically produce only aggregated flow statistics or a few selected packet attributes, do not always suffice, especially when model training relies on having features that are only available from packet traces. This shortfall manifests in both insufficient statistical resemblance to real traces and suboptimal performance on ML tasks when employed for data augmentation. In this paper, we apply diffusion models to generate high-resolution synthetic network traffic traces. We present *NetDiffusion*, a tool that uses a finely-tuned, controlled variant of a Stable Diffusion model to generate synthetic network traffic that is high fidelity and conforms to protocol specifications. Our evaluation demonstrates that packet captures generated from NetDiffusion can achieve higher statistical similarity to real data and improved ML model performance than current state-of-the-art approaches (e.g., GAN-based approaches). Furthermore, our synthetic traces are compatible with common network analysis tools and support a myriad of network tasks, suggesting that NetDiffusion can serve a broader spectrum of network analysis and testing tasks, extending beyond ML-centric applications.

**Question 3**: List all of the paper's authors in the format `Author 1, Author 2, Author 3`.

Maximilian Wolf, Julian Tritscher, Dieter Landes, Andreas Hotho, Daniel Schlör

**Question 4**: State directly which conference or journal published this paper, with no reasoning or extra information.

Computers & Security

**Question 5**: Describe in detail the core problem the paper addresses, then summarize it concisely.

Core problem: synthetic NetFlow/network traffic (from GAN- or GPT-style generators) is increasingly used to ease the scarcity of labeled data, but there is no unified, reproducible, comparable quality standard for "how close is the synthetic data to real data, and is it usable for NIDS tasks", which makes objective comparison across papers and generators difficult. The paper unifies distributional similarity (data-driven) and domain usability (domain-driven, e.g., syntax and task performance) into a single benchmark procedure via multiple metrics, structured organization, and baseline intervals.

**Question 6**: List the methods the paper proposes, summarizing the core idea of each as concisely as possible.

(1) Metric survey and classification: review similarity and utility evaluation methods and organize them into data-driven/domain-driven levels.

(2) Metric-set construction: select a set of practical NetFlow-oriented metrics and aggregate them into a Data Dissimilarity Score and a Domain Dissimilarity Score to reduce comparison complexity (a sketch of this aggregation follows the list).

(3) Baseline and bound benchmarking: compute intra-/inter-dataset score ranges on real data as reference intervals, then map generator outputs into those intervals to obtain an interpretable, objective comparison.

(4) Synthetic-data case-study workflow: periodically sample WGAN and GPT-2 during training, filter invalid NetFlow records with syntax checks, then compute both dissimilarity classes and visualize the training trajectory.
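
As a concrete reading of steps (2) and (3), here is a minimal sketch of folding normalized metric values into the two composite scores. The metric names, the plain-mean aggregation, and all example values are assumptions for illustration; the paper aggregates its metric set into these two scores, but its exact normalization and weighting may differ.

```python
# Sketch only: combine pre-normalized [0, 1] metrics (lower = more similar) into
# a Data and a Domain Dissimilarity Score; F1 scores arrive flipped as 1 - F1.
from statistics import mean

DATA_KEYS = ["jsd", "pearson_diff", "correlation_ratio_diff",
             "uncertainty_coeff_diff", "discriminator_separability"]
DOMAIN_KEYS = ["one_minus_f1_tstr", "one_minus_f1_trts", "syntax_violation_rate"]

def dissimilarity_scores(metrics: dict[str, float]) -> tuple[float, float]:
    """Return (data_score, domain_score) as plain means of the grouped metrics."""
    return (mean(metrics[k] for k in DATA_KEYS),
            mean(metrics[k] for k in DOMAIN_KEYS))

# Comparing two real samples of the same dataset gives the "good" end of the baseline band.
baseline = dissimilarity_scores({
    "jsd": 0.02, "pearson_diff": 0.03, "correlation_ratio_diff": 0.02,
    "uncertainty_coeff_diff": 0.04, "discriminator_separability": 0.05,
    "one_minus_f1_tstr": 0.08, "one_minus_f1_trts": 0.07, "syntax_violation_rate": 0.0,
})
print(f"data={baseline[0]:.3f}, domain={baseline[1]:.3f}")
```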

**Question 7**: State the dataset(s) the paper uses, including names and sources.

Three NetFlow benchmark datasets are used: NF-CSE-CIC-IDS2018, NF-ToN-IoT, and NF-UNSW-NB15. The paper notes that these NetFlow datasets are based on Sarhan et al. (2021), who converted the original datasets into the same format with the same NetFlow converter to ensure comparability.

**Question 8**: List all of the metrics used to evaluate the method, briefly explaining the role of each.

The metric set ultimately used for the benchmark (categorized as in Table 2) includes:

1. Univariate distributions: Jensen–Shannon divergence (per-feature distributional difference).
2. Multivariate relationships: Pearson correlation coefficient, correlation ratio, and uncertainty coefficient (whether numeric-numeric, numeric-categorical, and categorical-categorical correlation structure is preserved).
3. Population-level discrimination: discriminators (Isolation Forest, One-Class SVM) that separate real from synthetic data or characterize overall separability.
4. Task application: TSTR and TRTS ("train on synthetic, test on real" and "train on real, test on synthetic"), scored with the F1-Score; higher F1 means the synthetic data better supports effective classification.
5. Rule constraints: NetFlow syntax checks (e.g., IP/port/label/positive-value constraints, TCP-flag and UDP consistency, in/out sums) that filter structurally or semantically invalid NetFlow records.

A short sketch of the univariate check follows.
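
For item 1 above, a small sketch of computing a per-feature Jensen–Shannon divergence with SciPy; the lognormal stand-in columns are synthetic placeholders, not data from the paper.

```python
# Sketch only: histogram two columns over a shared range and compute their JSD.
import numpy as np
from scipy.spatial.distance import jensenshannon

def js_divergence(real: np.ndarray, synth: np.ndarray, bins: int = 50) -> float:
    lo, hi = min(real.min(), synth.min()), max(real.max(), synth.max())
    p, _ = np.histogram(real, bins=bins, range=(lo, hi), density=True)
    q, _ = np.histogram(synth, bins=bins, range=(lo, hi), density=True)
    # scipy returns the JS *distance* (sqrt of the divergence); square it for the divergence
    return jensenshannon(p, q, base=2) ** 2

rng = np.random.default_rng(0)
real_bytes = rng.lognormal(6.0, 1.0, 10_000)   # stand-in for a real "bytes per flow" column
synth_bytes = rng.lognormal(6.1, 1.1, 10_000)  # stand-in for a generated column
print(f"JSD(bytes) = {js_divergence(real_bytes, synth_bytes):.4f}")
```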

**Question 9**: Summarize the experimental results, including concrete numbers and the conclusions drawn.

Numerically, the paper normalizes each metric to the [0, 1] interval and converts the F1-Score to (1 − F1) so that it points in the same lower-is-better direction as the dissimilarity scores. The intra-/inter-dataset score distributions obtained from real-data comparisons (minimum/maximum, quantile, and median bands) serve as interpretable upper/lower baselines; results are presented mainly as training-history curves and band plots rather than a single comparison table in the body text. Conclusions: the Data Dissimilarity results show that both WGAN and GPT-2 fit the data distribution close to the target data during training, but the Domain Dissimilarity results show that both models still differ markedly from the target data in domain-application behavior, with no visible improvement over training. Distributional similarity alone therefore does not imply task/domain usability; both the data-driven and domain-driven evaluations are needed.

**Question 10**: Clearly describe the work done in the paper, listing the motivation, the contributions, and the main innovations.

Motivation: synthetic NetFlow can ease the scarcity of labeled NIDS data, but there is no objective, standardized, comparable quality-evaluation procedure, which blocks horizontal comparison across generators and published results.

Contributions and innovations: (1) a systematic literature review showing that evaluation standards are inconsistent; (2) a NetFlow-oriented multi-metric benchmark system that aggregates 14 metrics into the composite Data/Domain scores for easier comparison and tuning; (3) validation on three real NetFlow benchmarks that the metrics separate same-source from different-source samples and yield baseline intervals (upper/lower bounds); (4) a case study with WGAN and GPT-2 showing how to place generated data inside the baseline interval for objective evaluation; (5) open-source release of the benchmark framework and benchmark data for reuse and reproduction.

**Question 11**: Does this paper provide a benchmark in the network generation field?

Yes, though more precisely it provides a standardized benchmark for synthetic NetFlow data (the NetFlow representation of generated network traffic): a fixed metric set (aggregated into the Data/Domain Dissimilarity Scores), intra-/inter-dataset upper/lower bounds and baseline ranges computed on real data, and a procedure for placing the outputs of generators such as GANs and GPT-2 into that range for objective comparison. The authors state that the code and benchmark data are released so that others can reuse them.
@@ -0,0 +1,69 @@
# NetDiffusion: Network Data Augmentation Through Protocol-Constrained Traffic Generation

**Question 1**: Summarize the paper's content, covering the research background and problem, the objective, the methods, the main results, and the conclusions, in 150-300 words and using the paper's own terms and concepts.

The paper addresses the scarcity of labeled network datasets caused by privacy concerns and data staleness, and observes that existing synthesis methods mostly produce NetFlow records or a few packet attributes, yielding insufficient statistical similarity and limited ML gains. The authors propose NetDiffusion: a fine-tuned, controlled Stable Diffusion model that generates high-resolution pcap-level synthetic traffic, with protocol constraints and post-processing to ensure conformance to protocol specifications. The evaluation shows that it clearly outperforms the baselines on JSD/TVD/HD and achieves higher accuracy than GAN/NetShare in classification tasks when used for data augmentation; the synthetic pcaps are also compatible with common analysis tools, making them applicable to a broader range of network analysis and testing scenarios.

**Question 2**: Extract the paper's abstract verbatim; the abstract normally appears after "Abstract" and before the Introduction.

Datasets of labeled network traces are essential for a multitude of machine learning (ML) tasks in networking, yet their availability is hindered by privacy and maintenance concerns, such as data staleness. To overcome this limitation, synthetic network traces can often augment existing datasets. Unfortunately, current synthetic trace generation methods, which typically produce only aggregated flow statistics or a few selected packet attributes, do not always suffice, especially when model training relies on having features that are only available from packet traces. This shortfall manifests in both insufficient statistical resemblance to real traces and suboptimal performance on ML tasks when employed for data augmentation. In this paper, we apply diffusion models to generate high-resolution synthetic network traffic traces. We present *NetDiffusion*, a tool that uses a finely-tuned, controlled variant of a Stable Diffusion model to generate synthetic network traffic that is high fidelity and conforms to protocol specifications. Our evaluation demonstrates that packet captures generated from NetDiffusion can achieve higher statistical similarity to real data and improved ML model performance than current state-of-the-art approaches (e.g., GAN-based approaches). Furthermore, our synthetic traces are compatible with common network analysis tools and support a myriad of network tasks, suggesting that NetDiffusion can serve a broader spectrum of network analysis and testing tasks, extending beyond ML-centric applications.

**Question 3**: List all of the paper's authors in the format `Author 1, Author 2, Author 3`.

Xi Jiang, Shinan Liu, Aaron Gember-Jacobson, Arjun Nitin Bhagoji, Paul Schmitt, Francesco Bronzino, Nick Feamster

**Question 4**: State directly which conference or journal published this paper, with no reasoning or extra information.

Proceedings of the ACM on Measurement and Analysis of Computing Systems (Proc. ACM Meas. Anal. Comput. Syst.)

**Question 5**: Describe in detail the core problem the paper addresses, then summarize it concisely.

Core problem: under privacy and maintenance-cost constraints it is hard to obtain up-to-date labeled packet traces, while existing synthesis methods usually generate only aggregated flow statistics or a few attributes, which cannot support training and analysis that depend on pcap-level features; this shows up as insufficient statistical similarity and poor ML performance when the data is used for augmentation. The paper aims to generate synthetic network traffic that is simultaneously high fidelity, conformant to protocol specifications, and directly usable in pcap form by downstream tools and tasks.

**Question 6**: List the methods the paper proposes, summarizing the core idea of each as concisely as possible.

(1) NetDiffusion generation framework: use controlled Stable Diffusion to generate "network traffic image representations" and then produce pcap-level synthetic traffic, targeting high fidelity and protocol consistency.

(2) LoRA fine-tuning: efficiently fine-tune Stable Diffusion with LoRA so the model learns the traffic textures/patterns of specific application classes.

(3) ControlNet-guided generation: constrain the generated regions and field distributions at generation time so that header/protocol fields satisfy the specified distributions and protocol requirements.

(4) Post-generation heuristics: apply heuristic corrections to the output to further enforce protocol conformance (field-level fixes). A hedged sketch of the generation step follows.
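
As a rough illustration of methods (2) and (3), the sketch below wires LoRA weights and a ControlNet into a Stable Diffusion pipeline via the Hugging Face `diffusers` API. It is emphatically not NetDiffusion's released pipeline: the model ids, LoRA path, control image, and prompt are placeholders, and the paper's pcap-to-image encoding and protocol-conformance post-processing are omitted.

```python
# Sketch only: controlled Stable Diffusion generation with a LoRA and a ControlNet.
import torch
from PIL import Image
from diffusers import ControlNetModel, StableDiffusionControlNetPipeline

control = Image.open("header_constraint_map.png")  # hypothetical per-field constraint image

controlnet = ControlNetModel.from_pretrained(
    "lllyasviel/sd-controlnet-canny", torch_dtype=torch.float16)
pipe = StableDiffusionControlNetPipeline.from_pretrained(
    "runwayml/stable-diffusion-v1-5", controlnet=controlnet, torch_dtype=torch.float16)
pipe.load_lora_weights("path/to/traffic-lora")     # hypothetical LoRA fine-tuned on traffic images
pipe.to("cuda")

image = pipe(
    "pixel representation of a video streaming network flow",  # illustrative prompt
    image=control,
    num_inference_steps=30,
).images[0]
image.save("synthetic_flow.png")  # NetDiffusion would decode such an image back into a pcap
```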

**Question 7**: State the dataset(s) the paper uses, including names and sources.

The real dataset consists of "pcap files capturing traffic from ten prominent applications" spanning three macro services: Video Streaming (Netflix/YouTube/Amazon/Twitch), Video Conferencing (MS Teams/Google Meet/Zoom), and Social Media (Facebook/Twitter/Instagram), drawn from three data sources cited as [22, 62, 86] (marked by reference number in Table 2).

The paper also states that the "comprehensive dataset contains nearly 20,000 flows", of which 10% are randomly sampled in the evaluation for feasibility and consistency.

The authors additionally open-sourced their "sample datasets, pipeline, and results".

**Question 8**: List all of the metrics used to evaluate the method, briefly explaining the role of each.

Statistical similarity metrics: Jensen–Shannon Divergence (JSD) measures the informational overlap of two distributions; Total Variation Distance (TVD) measures the maximum difference between two distributions (worst-case deviation); Hellinger Distance (HD) is more sensitive to distribution tails and is used to observe rare-event/outlier differences. All three range from 0 to 1, with values closer to 0 indicating higher similarity. (A small TVD/HD sketch follows.)

Task utility metrics: ML classification accuracy (macro-level and micro-level), testing whether synthetic data used for augmentation or as substitute training data improves or degrades downstream recognition tasks.
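
A minimal sketch of the other two distributional metrics, computed on histogram counts; the protocol-count arrays are invented stand-ins, not numbers from the paper.

```python
# Sketch only: total variation distance and Hellinger distance between two
# discrete (histogram) distributions. Both lie in [0, 1] with 0 = identical.
import numpy as np

def tvd(p: np.ndarray, q: np.ndarray) -> float:
    p, q = p / p.sum(), q / q.sum()
    return 0.5 * np.abs(p - q).sum()         # worst-case probability gap over all events

def hellinger(p: np.ndarray, q: np.ndarray) -> float:
    p, q = p / p.sum(), q / q.sum()
    return np.sqrt(0.5 * ((np.sqrt(p) - np.sqrt(q)) ** 2).sum())  # tail-sensitive

real_proto = np.array([900, 80, 20], dtype=float)    # e.g., TCP/UDP/ICMP counts in real pcaps
synth_proto = np.array([870, 100, 30], dtype=float)  # counts in generated pcaps (stand-ins)
print(f"TVD = {tvd(real_proto, synth_proto):.3f}, HD = {hellinger(real_proto, synth_proto):.3f}")
```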

**Question 9**: Summarize the experimental results, including concrete numbers and the conclusions drawn.

Statistical similarity (Table 3): on pcaps, NetDiffusion reaches average JSD/TVD/HD of 0.04/0.04/0.05 over "all generated features"; on the shared example field IPv4 protocol it reaches 0.02/0.03/0.02, far better than random generation (pcap: 0.82/0.99/0.95) and also better than NetShare's overall NetFlow numbers (0.16/0.16/0.18).

Downstream classification (Table 4): in the Synthetic/Real setting (train on NetDiffusion-generated pcaps, test on real pcaps), the best macro-level accuracy is 0.738 (DT) and the best micro-level accuracy is 0.262 (DT); the comparable NetShare (NetFlow) results are only 0.396 (macro, RF) and 0.140 (micro, SVM).

In the Real/Synthetic direction, NetDiffusion reaches macro 0.542 (SVM) and micro 0.249 (SVM), overall better than NetShare's micro 0.102 (RF).

On non-ML usability, a tcpreplay replay of the Amazon traffic example shows both the NetDiffusion-generated and the real traffic at 1024 packets with 0 failed packets, i.e., the traces are parseable and replayable. Total bytes and rate differ, which the authors attribute to small bit-level generation deviations being amplified; finer-grained control and post-processing scaling are left as future work.

**Question 10**: Clearly describe the work done in the paper, listing the motivation, the contributions, and the main innovations.

Motivation: labeled packet traces are scarce and go stale, and synthesis methods that produce only NetFlow or a few attributes cannot support pcap-feature-dependent training and network analysis, hurting similarity and ML gains.

Contributions: (1) the NetDiffusion tool, which uses diffusion models to generate high-resolution synthetic network traffic that satisfies protocol specifications; (2) a systematic evaluation against NetShare and random generation showing better statistical similarity and classification performance; (3) compatibility with common network analysis tools, enabling a broader spectrum of network tasks beyond ML.

Main innovation: bringing controlled Stable Diffusion (fine-tuning + control) to pcap-level traffic generation, and achieving protocol-constrained traffic generation through generation-time control plus heuristic post-processing, making synthesis of "raw network traffic in pcap format" practical in both similarity and usability.

**Question 11**: Does this paper provide a benchmark in the network generation field?

It provides within-paper benchmarking: in the statistical-similarity evaluation, NetDiffusion is compared against NetShare and naive random generation as baselines, with JSD/TVD/HD and classification accuracy reported systematically. It does not, however, propose a unified, standardized benchmark suite for the whole network generation field (multi-dataset, multi-task, with a common submission protocol). The authors do open-source sample data, the pipeline, and results, which helps reproduction and side-by-side comparison, but this is closer to a reproducible evaluation framework plus example data than a community-level benchmark definition.
@@ -0,0 +1,95 @@
# Network Traffic Generation: A Survey and Methodology

**Question 1**: Summarize the paper's content, covering the research background and problem, the objective, the methods, the main results, and the conclusions, in 150-300 words and using the paper's own terms and concepts.

The paper observes that, because of privacy constraints and topology differences, production traffic traces are hard to reuse directly, so experimental research depends heavily on traffic generators for network traffic workloads. Its goal is not performance comparison but determining each tool's functional behaviors and offering a selection methodology aligned with experimental goals. Methodologically, the authors apply a custom-built analysis tool to run n-gram analysis over 7,479 ACM/USENIX papers, compile 92 traffic generators, and select the top 10 by usage popularity; they then propose a taxonomy (constant/maximum throughput, trace replay, model-based, script-driven, etc.) and summarize features, header-field configurability, and reported metrics in tabular digests. The results show that constant/max-throughput tools (iperf2 above all) have long dominated usage, and that the tables and workflow can systematically guide tool selection. The conclusion: align tool capabilities with workload requirements, and verify reported metrics with on-the-wire captures.

**Question 2**: Extract the paper's abstract verbatim; the abstract normally appears after "Abstract" and before the Introduction.

Network traffic workloads are widely utilized in applied research to verify correctness and to measure the impact of novel algorithms, protocols, and network functions. We provide a comprehensive survey of traffic generators referenced by researchers over the last 13 years, providing in-depth classification of the functional behaviors of the most frequently cited generators. These classifications are then used as a critical component of a methodology presented to aid in the selection of generators derived from the workload requirements of future research.

**Question 3**: List all of the paper's authors in the format `Author 1, Author 2, Author 3`.

Oluwamayowa Ade Adeleke, Nicholas Bastin, Deniz Gurkan

**Question 4**: State directly which conference or journal published this paper, with no reasoning or extra information.

ACM Computing Surveys (CSUR)

**Question 5**: Describe in detail the core problem the paper addresses, then summarize it concisely.

The core problem: with production traces hard to obtain or reuse, and traffic generators differing enormously in capability, researchers lack a systematic method for choosing a suitable traffic generator for their experimental goals and a clear characterization of the tools' functional behaviors. The authors stress that their focus is functional behaviors (variances, functionality) rather than performance, and they combine usage evidence from a large paper corpus, a taxonomy, and feature digests into an actionable selection methodology that maps workload requirements to tool capabilities. In short: turning tool selection from guesswork into a process that matches requirements against capabilities.

**Question 6**: List the methods the paper proposes, summarizing the core idea of each as concisely as possible.

Method 1: literature-based tool discovery and popularity analysis — run n-gram search with a custom-built analysis tool over 7,479 papers, verify hits manually, compile 92 traffic generators, rank them by usage popularity, and select the top 10 (a small sketch of this tally follows the list).

Method 2: taxonomy/classification framework — classify generators by how they "push packets into the network": constant/maximum throughput, application-level synthetic workload, trace replay, model-based, script-driven, and so on.

Method 3: tabular feature and metric digests — Tables 3/4/5 summarize features matching common experimental requirements, how protocol-stack header fields can be configured, and each tool's self-reported metrics, providing a structured basis for comparison and filtering.

Method 4: Traffic Generator Selection Methodology (with a worked example) — follow the steps Requirements → Availability → Traffic characteristics → Features (filtered with Tables 3/4) to progressively narrow the requirements down to a candidate tool set.
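
A small sketch of the Method 1 corpus tally: count, per paper, which known generators are mentioned. The generator list and the two-file corpus are placeholders; the authors' custom analysis tool and manual verification step are not reproduced here, and naive word matching (e.g., `pktgen` also matching inside `dpdk pktgen`) is part of why they verified hits by hand.

```python
# Sketch only: tally how many papers in a corpus mention each traffic generator.
import re
from collections import Counter

GENERATORS = ["iperf2", "iperf3", "netperf", "httperf", "moongen",
              "pktgen", "netcat", "tcpreplay", "scapy", "dpdk pktgen"]

def mentions(paper_text: str) -> set[str]:
    text = paper_text.lower()
    # word-boundary matching approximates the paper's n-gram search over full texts
    return {g for g in GENERATORS if re.search(rf"\b{re.escape(g)}\b", text)}

corpus = {  # placeholder stand-ins for extracted paper texts
    "paper_0001.txt": "We replay traces with tcpreplay and measure goodput with iperf2.",
    "paper_0002.txt": "Packets are crafted in Scapy; load is offered using MoonGen.",
}
popularity = Counter(g for text in corpus.values() for g in mentions(text))
for gen, n_papers in popularity.most_common():
    print(f"{gen}: mentioned in {n_papers} paper(s)")
```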

**Question 7**: State the dataset(s) the paper uses, including names and sources.

Dataset 1 (the paper's "corpus"): a literature corpus built by the authors of 7,479 computer-networking papers, 2,856 from ACM SIGCOMM-related conference/journal collections and 4,623 from USENIX-related collections, spanning 2006-2018, used for the n-gram analysis and usage statistics.

Dataset 2 (tool-inventory source): the compiled list of 92 traffic generators, drawn from the paper corpus (over 7,000 papers) plus general internet document searches.

Dataset 3 (external datasets relevant to trace replay): the paper notes that researchers obtain anonymized, payload-stripped trace files from public data sets for replay (no specific dataset names are given in that passage).

**Question 8**: List all of the metrics used to evaluate the method, briefly explaining the role of each.

Metric 1, Throughput: data transferred per unit time, gauging load intensity/bandwidth use.

Metric 2, Latency: the interval from send to receive, gauging delay.

Metric 3, Packet rate: packets arriving per unit time, gauging the packet sending rate.

Metric 4, Total no. of packets: packets sent over the whole generation run, gauging total workload size.

Metric 5, Total no. of bytes: bytes sent over the whole run, gauging total data volume.

Metric 6, Duration: elapsed time of the generation run, interpreted together with totals and rates.

Metric 7, Jitter: delay variation, gauging latency stability.

Metric 8, No. of retransmissions: retransmitted packets, reflecting congestion, loss, and protocol retransmission behavior.

Metric 9, No. of drops: dropped packets, reflecting reliability and network/system bottlenecks.

Metric 10, MSS: the TCP maximum segment size, characterizing TCP segmentation configuration.

Metric 11, Congestion window size(s): the congestion window, reflecting TCP congestion-control state.

Metric 12, CPU demand: CPU usage, gauging the generator's resource overhead.

Metric 13, Number of flows or connections: characterizing concurrency and connection diversity.

Metric 14, Request/response transaction rates: completion rate of request-response pairs (for request-response models), gauging transaction-level throughput.

**Question 9**: Summarize the experimental results, including concrete numbers and the conclusions drawn.

The paper itself does not target experimental performance comparison; it reports literature-based statistics: across the 2006-2018 corpus the authors analyzed 7,479 networking papers and compiled 92 traffic generators.

The usage statistics rank the top 10 by popularity as iperf2, netperf, httperf, moongen, scapy, linux pktgen, netcat, TCPreplay, iperf3, and DPDK pktgen, and show that constant/max-throughput generators (especially iperf2) consistently dominate usage.

As numeric evidence quoted from prior work, the survey cites experiments in which the bandwidth measured by different tools on a 100 Mbps link differed by up to 16.5 Mbps, with Iperf measuring 93.1 Mbps and IP Traffic 76.7 Mbps under the same setup, underscoring that generators have different strengths in different scenarios and that no single tool covers all network types.

**Question 10**: Clearly describe the work done in the paper, listing the motivation, the contributions, and the main innovations.

Motivation: production traffic traces are limited by privacy and topology reusability, so experiments rely on traffic generators to build workloads, yet the research community lacks a structured understanding of the tools' capability differences and a selection method.

Contribution 1: builds and publishes a broad survey with an evidence chain — n-gram analysis plus manual verification over 7,479 papers, a compilation of 92 traffic generators, the top 10, and usage trends.

Contribution 2: proposes a taxonomy with per-category sizes and explanations, framing generation approaches by how they "push packets into the network".

Contribution 3: provides structured digests (Tables 3/4/5) aligning experimental requirements with features, field configurability, and available metrics, and cautions that metrics should be verified with on-the-wire captures.

Main innovation: turns tool selection into a process via the Traffic Generator Selection Methodology, with a step-by-step example showing how requirements plus the table digests converge to a candidate tool set (e.g., finally narrowing down to scapy/moongen/dpdk pktgen).

**Question 11**: Does this paper provide a benchmark in the network generation field?

No. The paper states explicitly that its goal is not performance comparison but characterizing traffic generators' functional behaviors and proposing a selection methodology matched to experimental goals; it delivers a survey, a classification, and feature/metric digests (table digests), not a unified test platform that produces a reproducible benchmark ranking.