Add: dns leak detection

2026-01-17 18:45:24 +08:00
parent ccd4a31d21
commit cfa96bde08
30 changed files with 3973 additions and 16 deletions
--- a/docs/requirement_docs_v0.4.md
+++ b/docs/requirement_docs_v0.4.md
@@ -0,0 +1,484 @@
+Yep — your DNS leak detector design **fits your current architecture cleanly**, and **making it a dedicated subcrate is the right move**.
+
+Your existing workspace already follows “thin CLI + feature crates + platform traits + OS dispatch”, so the detector drops in naturally as **one new feature crate** plus a small extension to `wtfnet-platform` for **best-effort process attribution**. This matches the design’s pipeline idea: **Traffic Sensor → Classifier → (Process Attribution) → Leak Rules → Report**  
+
+Below I’ll do two things:
+
+1. **Evaluate fit + integration plan** (what needs changing / what can be reused)
+2. Provide **repo-ready v0.4.0 documents**, focused on implementing this detector.
+
+---
+
+# 1) Can this design fit your current architecture?
+
+## ✅ Yes — it matches your crate boundaries almost 1:1
+
+Your detector design already breaks down into modules that map directly into a new subcrate:
+
+* Policy & Configuration 
+* Passive Traffic Sensor (UDP53/TCP53/853/“DoH-ish”) 
+* Classifier (DNS / DoT / DoH + confidence) 
+* DNS Parser (plaintext only) 
+* Flow Tracker (interface/route/process correlation) 
+* Leak Detector rules A/B/C/D 
+* Report generator + structured events  
+
+So: **new crate = `wtfnet-dnsleak`**, and keep `wtfnet-dns` for **active query/detect/watch** (already exists).
+
+## ✅ What you can reuse immediately
+
+### Reuse from your current codebase
+
+* `wtfnet-dns watch` capture plumbing (you already have passive-ish capture)
+* existing DNS parsing logic (for UDP/TCP 53)
+* existing GeoIP enrichment pipeline (optional)
+* `wtfnet-platform` interface snapshot + routing info (for “which interface leaked?”) 
+* your JSON output envelope + logging style
+
+### Reuse from the detector design directly
+
+* Leak definitions A/B/C/D (this is already precise and CLI-tool friendly) 
+* DoH recognition levels + confidence model (strong → weak) 
+* “safe DNS path” abstraction (interfaces/dests/process/ports) 
+* process attribution confidence levels and failure reasons 
+* privacy modes (Full/Redacted/Minimal) 
+
+## ✅ What you’ll need to add (small + contained)
+
+### 1) New platform trait for flow → process owner (best effort)
+
+Your design explicitly wants PID/PPID/process name to show “who leaked DNS”, and it recommends an OS-specific provider interface.
+
+So extend `wtfnet-platform` with something like:
+
+* `FlowOwnerProvider::owner_of(tuple, timestamp) -> ProcessInfo + confidence`
+
+This stays consistent with your existing “platform traits + OS dispatch” architecture.
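+
+A minimal sketch of what that trait could look like, assuming a hypothetical `FlowTuple` struct that bundles the 5-tuple. All type and field names here are illustrative rather than existing `wtfnet-platform` items. The fallback provider shows the intended degradation: when the OS offers no lookup, the result carries no process, `Confidence::None`, and a failure reason, so detection can carry on without attribution.
+
+```rust
+use std::net::IpAddr;
+use std::time::SystemTime;
+
+/// Transport protocol of the observed flow.
+#[derive(Debug, Clone, Copy, PartialEq, Eq)]
+pub enum Proto { Udp, Tcp }
+
+/// Hypothetical 5-tuple describing one observed flow.
+#[derive(Debug, Clone, Copy, PartialEq, Eq)]
+pub struct FlowTuple {
+    pub proto: Proto,
+    pub src_ip: IpAddr,
+    pub src_port: u16,
+    pub dst_ip: IpAddr,
+    pub dst_port: u16,
+}
+
+/// How much the attribution can be trusted.
+#[derive(Debug, Clone, Copy, PartialEq, Eq)]
+pub enum Confidence { High, Medium, Low, None }
+
+#[derive(Debug, Clone, PartialEq, Eq)]
+pub struct ProcessInfo {
+    pub pid: u32,
+    pub ppid: Option<u32>,
+    pub name: Option<String>,
+}
+
+#[derive(Debug, Clone, PartialEq, Eq)]
+pub struct FlowOwnerResult {
+    pub process: Option<ProcessInfo>,
+    pub confidence: Confidence,
+    pub failure_reason: Option<String>,
+}
+
+/// Best-effort lookup: implementations return a result even on failure.
+pub trait FlowOwnerProvider {
+    fn owner_of(&self, flow: &FlowTuple, ts: SystemTime) -> FlowOwnerResult;
+}
+
+/// Fallback for platforms without a real implementation (assumed name).
+pub struct UnsupportedFlowOwner;
+
+impl FlowOwnerProvider for UnsupportedFlowOwner {
+    fn owner_of(&self, _flow: &FlowTuple, _ts: SystemTime) -> FlowOwnerResult {
+        FlowOwnerResult {
+            process: None,
+            confidence: Confidence::None,
+            failure_reason: Some("flow ownership lookup not supported".to_string()),
+        }
+    }
+}
+```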
+
+### 2) Route/interface classification (“tunnel vs physical vs loopback”)
+
+Your event schema wants `route_class` (tunnel/physical/loopback).
+Implement this via **policy-configured interface labels** + a small heuristic fallback.
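+
+One way to sketch "policy labels first, heuristic fallback second". The function name `classify_route` and the interface-name prefixes are assumptions chosen for illustration (common Linux, macOS, and Windows naming), not a vetted list; an `Unknown` variant covers names that match nothing.
+
+```rust
+use std::collections::HashMap;
+
+#[derive(Debug, Clone, Copy, PartialEq, Eq)]
+pub enum RouteClass { Loopback, Tunnel, Physical, Unknown }
+
+/// Classify an interface: an explicit policy label wins, otherwise fall back
+/// to name-prefix heuristics.
+pub fn classify_route(iface: &str, labels: &HashMap<String, RouteClass>) -> RouteClass {
+    if let Some(class) = labels.get(iface) {
+        return *class;
+    }
+    let name = iface.to_ascii_lowercase();
+    // Assumed prefixes; real deployments should prefer policy labels.
+    const TUNNEL: [&str; 5] = ["tun", "tap", "wg", "utun", "ppp"];
+    const PHYSICAL: [&str; 4] = ["eth", "en", "wl", "wi-fi"];
+    if name == "lo" || name.starts_with("lo0") || name.contains("loopback") {
+        RouteClass::Loopback
+    } else if TUNNEL.iter().any(|p| name.starts_with(*p)) {
+        RouteClass::Tunnel
+    } else if PHYSICAL.iter().any(|p| name.starts_with(*p)) {
+        RouteClass::Physical
+    } else {
+        RouteClass::Unknown
+    }
+}
+```
+
+Because the label map is consulted first, a user can override a wrong guess (for example, a tunnel that happens to be named `eth1`) without touching the heuristic.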
+
+### 3) DoH detection will be heuristic (and that’s OK)
+
+Your design already plans for this: DoH classification is “hard”, confidence-based.
+In v0.4, keep it simple:
+
+* DoT = TCP/853
+* Plain DNS = UDP/TCP 53 (+ parse QNAME)
+* “DoH-ish” = TCP/443 to known resolver host/IP OR “small HTTPS bursts” pattern
+
+…and attach a confidence level to each DoH-ish match.
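+
+The port-based rules above can be sketched as a single `match`. This is a simplified illustration: the `classify` signature is hypothetical, the known-resolver set holds IPs only (no hostname/SNI matching), and the "small HTTPS bursts" signal is assumed to be computed elsewhere and passed in as a boolean. The MEDIUM/LOW mapping follows the confidence levels listed in section 4.3 of the implementation guide in this document.
+
+```rust
+use std::collections::HashSet;
+use std::net::IpAddr;
+
+#[derive(Debug, Clone, Copy, PartialEq, Eq)]
+pub enum Proto { Udp, Tcp }
+
+#[derive(Debug, Clone, Copy, PartialEq, Eq)]
+pub enum Transport { Udp53, Tcp53, Dot, Doh, Unknown }
+
+#[derive(Debug, Clone, Copy, PartialEq, Eq)]
+pub enum DohConfidence { Medium, Low }
+
+#[derive(Debug, Clone, Copy, PartialEq, Eq)]
+pub struct Classification {
+    pub transport: Transport,
+    /// Only set when `transport` is `Doh`.
+    pub doh_confidence: Option<DohConfidence>,
+}
+
+/// Classify one outbound flow by protocol, destination port, and DoH hints.
+pub fn classify(
+    proto: Proto,
+    dst_ip: IpAddr,
+    dst_port: u16,
+    known_doh_resolvers: &HashSet<IpAddr>,
+    small_https_bursts: bool,
+) -> Classification {
+    let (transport, doh_confidence) = match (proto, dst_port) {
+        (Proto::Udp, 53) => (Transport::Udp53, None),
+        (Proto::Tcp, 53) => (Transport::Tcp53, None),
+        (Proto::Tcp, 853) => (Transport::Dot, None),
+        // Known endpoint match is checked before the weaker traffic-shape hint.
+        (Proto::Tcp, 443) if known_doh_resolvers.contains(&dst_ip) => {
+            (Transport::Doh, Some(DohConfidence::Medium))
+        }
+        (Proto::Tcp, 443) if small_https_bursts => {
+            (Transport::Doh, Some(DohConfidence::Low))
+        }
+        _ => (Transport::Unknown, None),
+    };
+    Classification { transport, doh_confidence }
+}
+```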
+
+---
+
+# 2) v0.4.0 documents (repo-ready)
+
+Below are two docs you can drop into `docs/`.
+
+---
+
+## `docs/RELEASE_v0.4.0.md`
+
+```markdown
+# WTFnet v0.4.0 — DNS Leak Detection
+
+v0.4.0 introduces a client-side DNS leak detector aimed at censorship-resistance threat models:
+detect when DNS behavior escapes the intended safe path. The detector focuses on evidence:
+transport, interface, destination, and (best-effort) process attribution.
+
+This release does NOT include HTTP/3 or OS-native TLS verification.
+
+---
+
+## 0) Summary
+
+New major capability: `dns leak` command group.
+
+Core idea:
+Passive monitor captures outbound DNS-like traffic → classify (Plain DNS / DoT / DoH) →
+enrich with interface/route/process metadata → evaluate leak definitions (A/B/C/D) →
+emit events + summary report.
+
+Leak definitions are explicit:
+- Leak-A: plaintext DNS outside safe path
+- Leak-B: split-policy intent leak (proxy-required domains resolved via ISP/local path)
+- Leak-C: encrypted DNS escape/bypass (DoH/DoT outside approved egress)
+- Leak-D: mismatch risk indicator (DNS egress differs from TCP/TLS egress)
+
+---
+
+## 1) Goals
+
+### G1. Detect DNS leaks without needing special test domains
+Passive detection must work continuously and produce evidence.
+
+### G2. Support censorship-resistance leak definitions
+Include both classic VPN-bypass leaks and split-policy intent leaks.
+
+### G3. Best-effort process attribution
+Attach PID/PPID/process name when OS allows; degrade gracefully with confidence.
+
+### G4. Privacy-aware by default
+Support privacy modes: Full / Redacted / Minimal.
+
+---
+
+## 2) Non-goals (v0.4.0)
+
+- No "doctor" / smart one-shot diagnosis command
+- No shell completions / man pages
+- No HTTP/3 support
+- No OS-native TLS verifier integration
+- No firewall modification / "kill switch" management (detection only)
+
+---
+
+## 3) New crates / architecture changes
+
+### 3.1 New subcrate: `wtfnet-dnsleak`
+Responsibilities:
+- passive sensor (pcap/pnet feature-gated)
+- DNS parser (plaintext only)
+- transport classifier: udp53/tcp53/dot/doh (confidence)
+- flow tracker + metadata enrichment
+- process attribution integration
+- leak rules engine (A/B/C/D)
+- structured event + summary report builder
+
+### 3.2 `wtfnet-platform` extension: flow ownership lookup
+Add a new trait:
+- FlowOwnerProvider: map observed traffic 5-tuple → process info (best-effort)
+
+Return process attribution confidence:
+HIGH/MEDIUM/LOW/NONE plus failure reason.
+
+---
+
+## 4) CLI scope
+
+### 4.1 Commands
+New command group:
+
+#### `wtfn dns leak watch`
+Start passive monitoring for a bounded duration (default 10s):
+- classify transports (udp53/tcp53/dot/doh)
+- apply leak rules and emit events + summary
+
+#### `wtfn dns leak status`
+Print baseline snapshot:
+- interfaces + routes
+- system DNS configuration
+- active policy summary
+
+#### `wtfn dns leak report`
+Parse a saved events file and produce a human summary.
+
+### 4.2 Flags (proposed)
+Common:
+- `--duration <Ns|Nms>` (default 10s)
+- `--iface <name>` (optional capture interface)
+- `--policy <path>` (JSON policy file)
+- `--profile <full-tunnel|proxy-stub|split>` (built-in presets)
+- `--privacy <full|redacted|minimal>` (default redacted)
+- `--geoip` (include GeoIP in event outputs)
+- `--out <path>` (write JSON report/events)
+
+---
+
+## 5) Policy model (v0.4.0)
+
+Safe DNS path constraints can be defined by:
+- allowed interfaces: loopback/tunnel
+- allowed destination set: proxy IPs, internal resolvers
+- allowed processes: only local stub/proxy can resolve upstream
+- allowed ports: e.g. only 443 to proxy server
+
+A DNS event is a leak if it violates safe-path constraints.
+
+Built-in profiles:
+1) full-tunnel VPN style
+2) proxy + local stub (default, censorship model)
+3) split policy
+
+---
+
+## 6) Outputs
+
+### 6.1 Leak events (structured)
+Each LeakEvent includes:
+- timestamp
+- transport: udp53/tcp53/dot/doh/unknown
+- qname/qtype (nullable)
+- interface + route_class
+- dst ip:port
+- process info (nullable) + attribution confidence
+- leak_type: A/B/C/D
+- severity: P0..P3
+- evidence fields + optional geoip
+
+### 6.2 Summary report
+- leak counts by type
+- top leaking processes (if available)
+- top resolver destinations
+- timeline/burst hints
+
+---
+
+## 7) Deliverables checklist
+
+MUST:
+- new `wtfnet-dnsleak` crate integrated into workspace + CLI
+- passive capture for UDP/TCP 53 and TCP 853
+- DoH heuristic classification (confidence-based)
+- policy engine + Leak-A/B/C/D rules
+- structured events + human summary
+- privacy modes full/redacted/minimal
+- best-effort process attribution with confidence and failure reason
+
+SHOULD:
+- saved report file support (`--out report.json`)
+- route_class inference with policy hints + heuristics
+
+NICE:
+- correlation_id (DNS → subsequent TCP/TLS connection) for Leak-D mismatch indicator
+
+---
+
+## 8) Definition of Done
+
+- v0.4.0 builds on Linux (Debian/Ubuntu) and Windows
+- `wtfn dns leak watch` detects:
+  - plaintext DNS leaving via a physical interface (Leak-A)
+  - DoT traffic leaving via an unapproved egress (Leak-C)
+  - DoH-ish encrypted resolver traffic outside policy (Leak-C)
+- events include interface + dst + (best-effort) PID/process info
+- output remains stable and additive; no breaking change to v0.3 commands
+
+```
+
+---
+
+## `docs/DNS_LEAK_DETECTOR_IMPLEMENTATION.md`
+
+```markdown
+# DNS Leak Detector — Implementation Guide (v0.4)
+
+This document explains how to implement the DNS leak detector as a new subcrate in WTFnet.
+
+---
+
+## 1) New crate: `wtfnet-dnsleak`
+
+### 1.1 Module layout
+
+crates/wtfnet-dnsleak/src/
+- lib.rs
+- policy.rs          # safe path constraints + presets
+- sensor.rs          # passive capture -> normalized TrafficEvent stream
+- classify.rs        # transport classification + confidence
+- parse_dns.rs       # plaintext DNS parser: qname/qtype
+- attrib.rs          # process attribution integration (platform provider)
+- route.rs           # interface/route classification (tunnel/physical/loopback)
+- rules.rs           # Leak-A/B/C/D evaluation
+- report.rs          # LeakEvent + SummaryReport builders
+- privacy.rs         # full/redacted/minimal redaction logic
+
+---
+
+## 2) Core data types
+
+### 2.1 TrafficEvent (raw from sensor)
+Fields:
+- ts: timestamp
+- proto: udp/tcp
+- src_ip, src_port
+- dst_ip, dst_port
+- iface_name (capture interface if known)
+- payload: optional bytes (only for plaintext DNS parsing)
+
+### 2.2 ClassifiedEvent
+Adds:
+- transport: udp53/tcp53/dot/doh/unknown
+- doh_confidence: HIGH/MEDIUM/LOW (only if doh)
+- qname/qtype: nullable
+
+### 2.3 EnrichedEvent
+Adds:
+- route_class: loopback/tunnel/physical/unknown
+- process info: pid/ppid/name (nullable)
+- attribution_confidence: HIGH/MEDIUM/LOW/NONE
+- attrib_failure_reason: optional string
+- geoip: optional
+
+### 2.4 LeakEvent (final output)
+Adds:
+- leak_type: A/B/C/D
+- severity: P0..P3
+- policy_rule_id
+- evidence: minimal structured evidence
+
+---
+
+## 3) Platform integration: Process Attribution Engine (PAE)
+
+### 3.1 Trait addition (wtfnet-platform)
+
+Add:
+trait FlowOwnerProvider {
+  fn owner_of(
+    &self,
+    proto: Proto,
+    src_ip: IpAddr,
+    src_port: u16,
+    dst_ip: IpAddr,
+    dst_port: u16,
+    ts: SystemTime,
+  ) -> FlowOwnerResult;
+}
+
+FlowOwnerResult:
+- pid, ppid, process_name (optional)
+- confidence: HIGH/MEDIUM/LOW/NONE
+- failure_reason: optional string
+
+Design rule: attribution is best-effort and never blocks leak detection.
+
+---
+
+## 4) Transport classification logic
+
+### 4.1 Plain DNS
+Match:
+- UDP dst port 53 OR TCP dst port 53
+Parse QNAME/QTYPE from payload.
+
+### 4.2 DoT
+Match:
+- TCP dst port 853
+
+### 4.3 DoH (heuristic)
+Match candidates:
+- TCP dst port 443 AND (one of):
+  - dst IP in configured DoH resolver list
+  - dst SNI matches known DoH provider list (if available)
+  - frequent small HTTPS bursts pattern (weak)
+
+Attach confidence:
+- MEDIUM: known endpoint match
+- LOW: traffic-shape heuristic only
+
+Important: you mostly need to detect encrypted resolver traffic bypassing the proxy channel,
+not to fully prove DoH with payload inspection.
+
+---
+
+## 5) Policy model
+
+Policy defines "safe DNS path" constraints:
+- allowed interfaces
+- allowed destinations (IP/CIDR)
+- allowed processes
+- allowed ports
+
+A DNS event is a leak if it violates safe-path constraints.
+
+### 5.1 Built-in profiles
+
+full-tunnel:
+- allow DNS only via tunnel iface or loopback stub
+- any UDP/TCP 53 on physical iface => Leak-A
+
+proxy-stub (default):
+- allow DNS only to loopback stub
+- allow stub upstream only to proxy destinations
+- flag direct DoH/DoT outside proxy path => Leak-C
+
+split:
+- allow plaintext DNS only for allowlist
+- unknown domains must resolve via the proxy; a violation => Leak-B
+
+---
+
+## 6) Leak rules (A/B/C/D)
+
+Leak-A (plaintext escape):
+- transport udp53/tcp53
+- route_class not in the policy's allowed set
+- dst not in allowed destination set
+
+Leak-B (split policy intent leak):
+- qname matches proxy-required set or "unknown"
+- query observed going to an ISP/local resolver or over a non-tunnel iface
+
+Leak-C (encrypted bypass):
+- DoT or DoH flow exists
+- not via approved egress path (iface/destination)
+
+Leak-D (mismatch indicator):
+- correlate qname to later TCP/TLS flows (optional v0.4 NICE)
+
+---
+
+## 7) Privacy modes
+
+Because domains and cmdlines are sensitive, support:
+- Full: store full qname and cmdline
+- Redacted (default): hash qname or keep eTLD+1 only; truncate cmdline
+- Minimal: no domains/cmdline; keep leak counts + resolver IPs + process name
+
+Privacy mode applies in report builder, not in sensor.
+
+---
+
+## 8) CLI integration
+
+Add under `dns` command group:
+
+- `dns leak status`
+- `dns leak watch`
+- `dns leak report`
+
+`watch` prints:
+- a human-readable summary report by default
+- a structured report with the events list when `--json` is passed
+
+---
+
+## 9) Recommended incremental build plan
+
+Phase 1 (core passive detection):
+- sensor: udp/tcp capture
+- classify: udp53/tcp53/dot
+- parse plaintext qname/qtype
+- policy: allowlist + allowed interfaces/dests
+- leak rules: Leak-A + Leak-C (DoT)
+- report: events + summary
+
+Phase 2 (process attribution + DoH heuristics):
+- platform FlowOwnerProvider impls
+- DoH heuristic classification + confidence
+- privacy modes
+
+Phase 3 (optional correlation / Leak-D):
+- flow tracker correlating DNS -> TCP/TLS connect events
+- mismatch indicator output
+
+```
+
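+The Leak-A and Leak-C rules from section 6 of the guide above could be evaluated roughly as follows. This sketch makes several assumptions: it reads the listed conditions as a conjunction (a flow leaks only when neither its route class nor its destination is approved), it matches destinations by exact IP rather than CIDR, and the `Policy`, `EnrichedEvent`, and `evaluate` names are illustrative. A stricter policy could flag any single violated constraint instead.
+
+```rust
+use std::net::IpAddr;
+
+#[derive(Debug, Clone, Copy, PartialEq, Eq)]
+pub enum Transport { Udp53, Tcp53, Dot, Doh, Unknown }
+
+#[derive(Debug, Clone, Copy, PartialEq, Eq)]
+pub enum RouteClass { Loopback, Tunnel, Physical, Unknown }
+
+#[derive(Debug, Clone, Copy, PartialEq, Eq)]
+pub enum LeakType { A, C }
+
+/// Simplified safe-path policy: allowed route classes and exact destination IPs.
+pub struct Policy {
+    pub allowed_routes: Vec<RouteClass>,
+    pub allowed_dests: Vec<IpAddr>,
+}
+
+/// The subset of an enriched event that these two rules need.
+pub struct EnrichedEvent {
+    pub transport: Transport,
+    pub route_class: RouteClass,
+    pub dst_ip: IpAddr,
+}
+
+/// A flow is on an approved egress if its route class or destination is allowed.
+fn on_approved_egress(ev: &EnrichedEvent, policy: &Policy) -> bool {
+    policy.allowed_routes.contains(&ev.route_class)
+        || policy.allowed_dests.contains(&ev.dst_ip)
+}
+
+/// Returns the leak type for an event, or `None` if it stays on the safe path.
+pub fn evaluate(ev: &EnrichedEvent, policy: &Policy) -> Option<LeakType> {
+    if on_approved_egress(ev, policy) {
+        return None;
+    }
+    match ev.transport {
+        Transport::Udp53 | Transport::Tcp53 => Some(LeakType::A), // plaintext escape
+        Transport::Dot | Transport::Doh => Some(LeakType::C),     // encrypted bypass
+        Transport::Unknown => None,
+    }
+}
+```
+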
+---
+
+# Small note about *where* it lives in your architecture
+
+This design slots in with **minimal churn**:
+
+* ✅ Add `crates/wtfnet-dnsleak`
+* ✅ Add `dns leak ...` subcommands in CLI
+* ✅ Add a **single** new platform trait for process attribution (best-effort)
+* ✅ Reuse your existing `dns watch` capture approach as the sensor
+
+…which is exactly what your design describes: passive monitoring + classification + rules + evidence output, and the PAE “event enricher” location in the pipeline.
+
+**If it's too hard to detect DoH traffic, skip it.**
+
+---