Yep, your DNS leak detector design **fits your current architecture cleanly**, and **making it a dedicated subcrate is the right move**.

Your existing workspace already follows “thin CLI + feature crates + platform traits + OS dispatch”, so the detector drops in naturally as **one new feature crate** plus a small extension to `wtfnet-platform` for **best-effort process attribution**. This matches the design’s pipeline idea: **Traffic Sensor → Classifier → (Process Attribution) → Leak Rules → Report**.

Below I’ll do two things:

1. **Evaluate fit + integration plan** (what needs changing / what can be reused)
2. Provide **repo-ready v0.4.0 documents**, focused on implementing this detector.

---

# 1) Can this design fit your current architecture?

## ✅ Yes: it matches your crate boundaries almost 1:1

Your detector design already breaks down into modules that map directly into a new subcrate:

* Policy & Configuration
* Passive Traffic Sensor (UDP53/TCP53/853/“DoH-ish”)
* Classifier (DNS / DoT / DoH + confidence)
* DNS Parser (plaintext only)
* Flow Tracker (interface/route/process correlation)
* Leak Detector rules A/B/C/D
* Report generator + structured events

So: **new crate = `wtfnet-dnsleak`**, and keep `wtfnet-dns` for **active query/detect/watch** (already exists).

## ✅ What you can reuse immediately

### Reuse from your current codebase

* `wtfnet-dns watch` capture plumbing (you already have passive-ish capture)
* existing DNS parsing logic (for UDP/TCP 53)
* existing GeoIP enrichment pipeline (optional)
* `wtfnet-platform` interface snapshot + routing info (for “which interface leaked?”)
* your JSON output envelope + logging style

### Reuse from the detector design directly

* Leak definitions A/B/C/D (this is already precise and CLI-tool friendly)
* DoH recognition levels + confidence model (strong → weak)
* “safe DNS path” abstraction (interfaces/dests/process/ports)
* process attribution confidence levels and failure reasons
* privacy modes (Full/Redacted/Minimal)

## ✅ What you’ll need to add (small + contained)

### 1) New platform trait for flow → process owner (best effort)

Your design explicitly wants PID/PPID/process name to show “who leaked DNS”, and it recommends an OS-specific provider interface.

So extend `wtfnet-platform` with something like:

* `FlowOwnerProvider::owner_of(tuple, timestamp) -> ProcessInfo + confidence`

This stays consistent with your existing “platform traits + OS dispatch” architecture.
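
Here’s a minimal Rust sketch of what that trait could look like. Everything beyond `FlowOwnerProvider::owner_of` is an assumption for illustration: the `FlowTuple`, `Confidence`, and `FlowOwnerResult` shapes and the `NoopFlowOwner` fallback are hypothetical names, not existing `wtfnet-platform` types. Real OS-specific providers would replace the no-op.

```rust
use std::net::IpAddr;
use std::time::SystemTime;

/// Transport protocol of the observed flow.
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
pub enum Proto { Udp, Tcp }

/// How much the attribution can be trusted.
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
pub enum Confidence { High, Medium, Low, None }

/// The flow 5-tuple as seen by the sensor.
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
pub struct FlowTuple {
    pub proto: Proto,
    pub src_ip: IpAddr,
    pub src_port: u16,
    pub dst_ip: IpAddr,
    pub dst_port: u16,
}

/// Best-effort answer: process fields stay `None` when the lookup fails.
#[derive(Debug, Clone, PartialEq, Eq)]
pub struct FlowOwnerResult {
    pub pid: Option<u32>,
    pub ppid: Option<u32>,
    pub process_name: Option<String>,
    pub confidence: Confidence,
    pub failure_reason: Option<String>,
}

pub trait FlowOwnerProvider {
    fn owner_of(&self, flow: &FlowTuple, ts: SystemTime) -> FlowOwnerResult;
}

/// Hypothetical fallback for platforms without an implementation:
/// it never fails and never attributes.
pub struct NoopFlowOwner;

impl FlowOwnerProvider for NoopFlowOwner {
    fn owner_of(&self, _flow: &FlowTuple, _ts: SystemTime) -> FlowOwnerResult {
        FlowOwnerResult {
            pid: None,
            ppid: None,
            process_name: None,
            confidence: Confidence::None,
            failure_reason: Some("flow ownership lookup not supported".to_string()),
        }
    }
}
```

A no-op provider like this gives unsupported platforms a safe default: callers always get a result with confidence `None` plus a failure reason, so leak detection never waits on attribution.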

### 2) Route/interface classification (“tunnel vs physical vs loopback”)

Your event schema wants `route_class` (tunnel/physical/loopback).
Implement this via **policy-configured interface labels** + a small heuristic fallback.
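
A rough sketch of that lookup order, labels first and heuristics second. The `classify_route` function, the `RouteClass` enum, and the interface-name prefixes are assumptions for illustration; interface naming differs across systems, so the fallback is only a weak guess.

```rust
use std::collections::HashMap;

#[derive(Debug, Clone, Copy, PartialEq, Eq)]
pub enum RouteClass { Loopback, Tunnel, Physical, Unknown }

/// Classify an interface: explicit policy labels win,
/// name heuristics are only the fallback.
pub fn classify_route(iface: &str, labels: &HashMap<String, RouteClass>) -> RouteClass {
    if let Some(class) = labels.get(iface) {
        return *class;
    }
    let name = iface.to_ascii_lowercase();
    // Assumed prefixes; real systems vary, so treat a match as a hint only.
    const TUNNEL_PREFIXES: [&str; 6] = ["tun", "tap", "wg", "utun", "ppp", "ipsec"];
    const PHYSICAL_PREFIXES: [&str; 4] = ["eth", "en", "wlan", "wl"];
    if name == "lo" || name.starts_with("lo0") || name.contains("loopback") {
        RouteClass::Loopback
    } else if TUNNEL_PREFIXES.iter().any(|p| name.starts_with(*p)) {
        RouteClass::Tunnel
    } else if PHYSICAL_PREFIXES.iter().any(|p| name.starts_with(*p)) {
        RouteClass::Physical
    } else {
        RouteClass::Unknown
    }
}
```

With a policy label such as `eth1 -> Tunnel`, the label wins over the `eth` prefix heuristic; unlabeled names such as `wg0` fall through to the prefix check, and anything unrecognized stays `Unknown`.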

### 3) DoH detection will be heuristic (and that’s OK)

Your design already plans for this: DoH classification is “hard” and confidence-based.
In v0.4, keep it simple and attach a confidence level to each classification:

* DoT = TCP/853
* Plain DNS = UDP/TCP 53 (+ parse QNAME)
* “DoH-ish” = TCP/443 to known resolver host/IP OR “small HTTPS bursts” pattern
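
As a sketch, these v0.4 rules fit in one `match` expression. The type names, the `known_doh_resolvers` set, and the `looks_like_small_https_bursts` flag (assumed to be computed elsewhere, for example by a flow tracker) are hypothetical. The confidence mapping follows the implementation guide further down: a known endpoint is MEDIUM, traffic shape alone is LOW.

```rust
use std::collections::HashSet;
use std::net::IpAddr;

#[derive(Debug, Clone, Copy, PartialEq, Eq)]
pub enum Proto { Udp, Tcp }

#[derive(Debug, Clone, Copy, PartialEq, Eq)]
pub enum DohConfidence { Medium, Low }

#[derive(Debug, Clone, Copy, PartialEq, Eq)]
pub enum Transport {
    Udp53,
    Tcp53,
    Dot,
    Doh(DohConfidence),
    Unknown,
}

/// Classify a flow by protocol and destination only (no payload inspection).
pub fn classify_transport(
    proto: Proto,
    dst_ip: IpAddr,
    dst_port: u16,
    known_doh_resolvers: &HashSet<IpAddr>,
    looks_like_small_https_bursts: bool,
) -> Transport {
    match (proto, dst_port) {
        (Proto::Udp, 53) => Transport::Udp53,
        (Proto::Tcp, 53) => Transport::Tcp53,
        (Proto::Tcp, 853) => Transport::Dot,
        // Known resolver endpoint on 443: stronger evidence than traffic shape.
        (Proto::Tcp, 443) if known_doh_resolvers.contains(&dst_ip) => {
            Transport::Doh(DohConfidence::Medium)
        }
        // Traffic-shape heuristic only: weakest signal.
        (Proto::Tcp, 443) if looks_like_small_https_bursts => {
            Transport::Doh(DohConfidence::Low)
        }
        _ => Transport::Unknown,
    }
}
```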

---

# 2) v0.4.0 documents (repo-ready)

Below are two docs you can drop into `docs/`.

---

## `docs/RELEASE_v0.4.0.md`

```markdown
# WTFnet v0.4.0: DNS Leak Detection

v0.4.0 introduces a client-side DNS leak detector aimed at censorship-resistance threat models:
detect when DNS behavior escapes the intended safe path. The detector focuses on evidence:
transport, interface, destination, and (best-effort) process attribution.

This release does NOT include HTTP/3 or OS-native TLS verification.

---

## 0) Summary

New major capability: `dns leak` command group.

Core idea:
Passive monitor captures outbound DNS-like traffic → classify (Plain DNS / DoT / DoH) →
enrich with interface/route/process metadata → evaluate leak definitions (A/B/C/D) →
emit events + summary report.

Leak definitions are explicit:
- Leak-A: plaintext DNS outside safe path
- Leak-B: split-policy intent leak (proxy-required domains resolved via ISP/local path)
- Leak-C: encrypted DNS escape/bypass (DoH/DoT outside approved egress)
- Leak-D: mismatch risk indicator (DNS egress differs from TCP/TLS egress)

---

## 1) Goals

### G1. Detect DNS leaks without needing special test domains
Passive detection must work continuously and produce evidence.

### G2. Support censorship-resistance leak definitions
Include both classic VPN-bypass leaks and split-policy intent leaks.

### G3. Best-effort process attribution
Attach PID/PPID/process name when OS allows; degrade gracefully with confidence.

### G4. Privacy-aware by default
Support privacy modes: Full / Redacted / Minimal.

---

## 2) Non-goals (v0.4.0)

- No "doctor" / smart one-shot diagnosis command
- No shell completions / man pages
- No HTTP/3 support
- No OS-native TLS verifier integration
- No firewall modification / "kill switch" management (detection only)

---

## 3) New crates / architecture changes

### 3.1 New subcrate: `wtfnet-dnsleak`
Responsibilities:
- passive sensor (pcap/pnet feature-gated)
- DNS parser (plaintext only)
- transport classifier: udp53/tcp53/dot/doh (confidence)
- flow tracker + metadata enrichment
- process attribution integration
- leak rules engine (A/B/C/D)
- structured event + summary report builder

### 3.2 `wtfnet-platform` extension: flow ownership lookup
Add a new trait:
- FlowOwnerProvider: map observed traffic 5-tuple → process info (best-effort)

Return process attribution confidence:
HIGH/MEDIUM/LOW/NONE plus failure reason.

---

## 4) CLI scope

### 4.1 Commands
New command group:

#### `wtfn dns leak watch`
Start passive monitoring for a bounded duration (default 10s):
- classify transports (udp53/tcp53/dot/doh)
- apply leak rules and emit events + summary

#### `wtfn dns leak status`
Print baseline snapshot:
- interfaces + routes
- system DNS configuration
- active policy summary

#### `wtfn dns leak report`
Parse a saved events file and produce a human-readable summary.

### 4.2 Flags (proposed)
Common:
- `--duration <Ns|Nms>` (default 10s)
- `--iface <name>` (optional capture interface)
- `--policy <path>` (JSON policy file)
- `--profile <full-tunnel|proxy-stub|split>` (built-in presets)
- `--privacy <full|redacted|minimal>` (default redacted)
- `--geoip` (include GeoIP in event outputs)
- `--out <path>` (write JSON report/events)

---

## 5) Policy model (v0.4.0)

Safe DNS path constraints can be defined by:
- allowed interfaces: loopback/tunnel
- allowed destination set: proxy IPs, internal resolvers
- allowed processes: only local stub/proxy can resolve upstream
- allowed ports: e.g. only 443 to proxy server

A DNS event is a leak if it violates safe-path constraints.

Built-in profiles:
1) full-tunnel VPN style
2) proxy + local stub (default, censorship model)
3) split policy

---

## 6) Outputs

### 6.1 Leak events (structured)
Each LeakEvent includes:
- timestamp
- transport: udp53/tcp53/dot/doh/unknown
- qname/qtype (nullable)
- interface + route_class
- dst ip:port
- process info (nullable) + attribution confidence
- leak_type: A/B/C/D
- severity: P0..P3
- evidence fields + optional geoip

### 6.2 Summary report
- leak counts by type
- top leaking processes (if available)
- top resolver destinations
- timeline/burst hints

---

## 7) Deliverables checklist

MUST:
- new `wtfnet-dnsleak` crate integrated into workspace + CLI
- passive capture for UDP/TCP 53 and TCP 853
- DoH heuristic classification (confidence-based)
- policy engine + Leak-A/B/C/D rules
- structured events + human summary
- privacy modes full/redacted/minimal
- best-effort process attribution with confidence and failure reason

SHOULD:
- saved report file support (`--out report.json`)
- route_class inference with policy hints + heuristics

NICE:
- correlation_id (DNS → subsequent TCP/TLS connection) for Leak-D mismatch indicator

---

## 8) Definition of Done

- v0.4.0 builds on Linux (Debian/Ubuntu) and Windows
- `wtfn dns leak watch` detects:
  - plaintext DNS leaving via a physical interface (Leak-A)
  - DoT traffic outside approved egress (Leak-C)
  - DoH-ish encrypted resolver traffic outside policy (Leak-C)
- events include interface + dst + (best-effort) PID/process info
- output remains stable and additive; no breaking change to v0.3 commands

```
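
To illustrate the safe-path policy from the release notes, here is a minimal Rust sketch of a constraint check. All names (`SafePathPolicy`, `DnsFlow`, `Violation`) are hypothetical. It assumes an empty set means "no constraint on this dimension", matches destinations by exact IP rather than CIDR, and lets unattributed flows pass the process check because attribution is best-effort. A flow would be a leak candidate when `violations` returns a non-empty list.

```rust
use std::collections::HashSet;
use std::net::IpAddr;

#[derive(Debug, Clone, Copy, PartialEq, Eq, Hash)]
pub enum RouteClass { Loopback, Tunnel, Physical, Unknown }

/// One observed DNS-like flow, reduced to the fields the policy inspects.
pub struct DnsFlow {
    pub route_class: RouteClass,
    pub dst_ip: IpAddr,
    pub dst_port: u16,
    pub process_name: Option<String>,
}

/// Safe-path constraints. An empty set means "unconstrained" (assumption).
#[derive(Default)]
pub struct SafePathPolicy {
    pub allowed_routes: HashSet<RouteClass>,
    pub allowed_dests: HashSet<IpAddr>,
    pub allowed_processes: HashSet<String>,
    pub allowed_ports: HashSet<u16>,
}

#[derive(Debug, Clone, Copy, PartialEq, Eq)]
pub enum Violation { Interface, Destination, Process, Port }

impl SafePathPolicy {
    /// Return every constraint the flow violates, in a fixed order.
    pub fn violations(&self, flow: &DnsFlow) -> Vec<Violation> {
        let mut out = Vec::new();
        if !self.allowed_routes.is_empty() && !self.allowed_routes.contains(&flow.route_class) {
            out.push(Violation::Interface);
        }
        if !self.allowed_dests.is_empty() && !self.allowed_dests.contains(&flow.dst_ip) {
            out.push(Violation::Destination);
        }
        if !self.allowed_processes.is_empty() {
            // Unattributed flows pass: attribution is best-effort (assumption).
            let ok = flow
                .process_name
                .as_ref()
                .map_or(true, |name| self.allowed_processes.contains(name));
            if !ok {
                out.push(Violation::Process);
            }
        }
        if !self.allowed_ports.is_empty() && !self.allowed_ports.contains(&flow.dst_port) {
            out.push(Violation::Port);
        }
        out
    }
}
```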

---

## `docs/DNS_LEAK_DETECTOR_IMPLEMENTATION.md`

```markdown
# DNS Leak Detector: Implementation Guide (v0.4)

This document explains how to implement the DNS leak detector as a new subcrate in WTFnet.

---

## 1) New crate: `wtfnet-dnsleak`

### 1.1 Module layout

crates/wtfnet-dnsleak/src/
- lib.rs
- policy.rs        # safe path constraints + presets
- sensor.rs        # passive capture -> normalized TrafficEvent stream
- classify.rs      # transport classification + confidence
- parse_dns.rs     # plaintext DNS parser: qname/qtype
- attrib.rs        # process attribution integration (platform provider)
- route.rs         # interface/route classification (tunnel/physical/loopback)
- rules.rs         # Leak-A/B/C/D evaluation
- report.rs        # LeakEvent + SummaryReport builders
- privacy.rs       # full/redacted/minimal redaction logic

---

## 2) Core data types

### 2.1 TrafficEvent (raw from sensor)
Fields:
- ts: timestamp
- proto: udp/tcp
- src_ip, src_port
- dst_ip, dst_port
- iface_name (capture interface if known)
- payload: optional bytes (only for plaintext DNS parsing)

### 2.2 ClassifiedEvent
Adds:
- transport: udp53/tcp53/dot/doh/unknown
- doh_confidence: HIGH/MEDIUM/LOW (only if doh)
- qname/qtype: nullable

### 2.3 EnrichedEvent
Adds:
- route_class: loopback/tunnel/physical/unknown
- process info: pid/ppid/name (nullable)
- attribution_confidence: HIGH/MEDIUM/LOW/NONE
- attrib_failure_reason: optional string
- geoip: optional

### 2.4 LeakEvent (final output)
Adds:
- leak_type: A/B/C/D
- severity: P0..P3
- policy_rule_id
- evidence: minimal structured evidence

---

## 3) Platform integration: Process Attribution Engine (PAE)

### 3.1 Trait addition (wtfnet-platform)

Add:

    trait FlowOwnerProvider {
        fn owner_of(
            &self,
            proto: Proto,
            src_ip: IpAddr,
            src_port: u16,
            dst_ip: IpAddr,
            dst_port: u16,
            ts: SystemTime,
        ) -> FlowOwnerResult;
    }

FlowOwnerResult:
- pid, ppid, process_name (optional)
- confidence: HIGH/MEDIUM/LOW/NONE
- failure_reason: optional string

Design rule: attribution is best-effort and never blocks leak detection.

---

## 4) Transport classification logic

### 4.1 Plain DNS
Match:
- UDP dst port 53 OR TCP dst port 53

Parse QNAME/QTYPE from the payload (over TCP, skip the 2-byte length prefix first).

### 4.2 DoT
Match:
- TCP dst port 853

### 4.3 DoH (heuristic)
Match candidates:
- TCP dst port 443 AND (one of):
  - dst IP in configured DoH resolver list
  - dst SNI matches known DoH provider list (if available)
  - frequent small HTTPS bursts pattern (weak)

Attach confidence:
- MEDIUM: known endpoint match
- LOW: traffic-shape heuristic only

Important: you mostly need to detect encrypted resolver traffic bypassing the proxy channel,
not to fully prove DoH with payload inspection.

---

## 5) Policy model

Policy defines "safe DNS path" constraints:
- allowed interfaces
- allowed destinations (IP/CIDR)
- allowed processes
- allowed ports

A DNS event is a leak if it violates safe-path constraints.

### 5.1 Built-in profiles

full-tunnel:
- allow DNS only via tunnel iface or loopback stub
- any UDP/TCP 53 on physical iface => Leak-A

proxy-stub (default):
- allow DNS only to loopback stub
- allow stub upstream only to proxy destinations
- flag direct DoH/DoT outside proxy path => Leak-C

split:
- allow plaintext DNS only for allowlisted domains
- domains not on the allowlist ("unknown") must resolve via the proxy; otherwise => Leak-B

---

## 6) Leak rules (A/B/C/D)

Leak-A (plaintext escape):
- transport udp53/tcp53
- route_class not in the allowed set
- dst not in allowed destination set

Leak-B (split policy intent leak):
- qname matches proxy-required set or "unknown"
- query observed going to ISP/domicile resolver or non-tunnel iface

Leak-C (encrypted bypass):
- DoT or DoH flow exists
- not via approved egress path (iface/destination)

Leak-D (mismatch indicator):
- correlate qname to later TCP/TLS flows (optional v0.4 NICE)

---

## 7) Privacy modes

Because domains and cmdlines are sensitive, support:
- Full: store full qname and cmdline
- Redacted (default): hash qname or keep eTLD+1 only; truncate cmdline
- Minimal: no domains/cmdline; keep leak counts + resolver IPs + process name

Privacy mode applies in the report builder, not in the sensor.

---

## 8) CLI integration

Add under `dns` command group:

- `dns leak status`
- `dns leak watch`
- `dns leak report`

`watch` returns:
- human-readable summary report by default
- `--json` returns structured report with events list

---

## 9) Recommended incremental build plan

Phase 1 (core passive detection):
- sensor: udp/tcp capture
- classify: udp53/tcp53/dot
- parse plaintext qname/qtype
- policy: allowlist + allowed interfaces/dests
- leak rules: Leak-A + Leak-C (DoT)
- report: events + summary

Phase 2 (process attribution + DoH heuristics):
- platform FlowOwnerProvider impls
- DoH heuristic classification + confidence
- privacy modes

Phase 3 (optional correlation / Leak-D):
- flow tracker correlating DNS -> TCP/TLS connect events
- mismatch indicator output

```
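
The Phase 1 rules (Leak-A and Leak-C) could be sketched like this. `RuleInput` and `evaluate` are hypothetical names, and the inputs are assumed to be precomputed by the policy and route stages. One assumption to flag: the sketch treats a flow as off-path when either the route or the destination is not allowed; if you want both conditions to be required, change the `||` to `&&`. Leak-B and Leak-D are left out.

```rust
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
pub enum Transport { Udp53, Tcp53, Dot, Doh, Unknown }

/// Only the leak types covered by this Phase 1 sketch.
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
pub enum LeakType { A, C }

/// Inputs the rules need, assumed to be precomputed by earlier stages.
pub struct RuleInput {
    pub transport: Transport,
    /// True when route_class is in the policy's allowed set.
    pub route_allowed: bool,
    /// True when the destination is in the policy's allowed set.
    pub dst_allowed: bool,
}

/// Return the leak type for a flow, or `None` when it stays on the safe path.
pub fn evaluate(input: &RuleInput) -> Option<LeakType> {
    // Assumption: violating either constraint counts as leaving the safe path.
    let off_path = !input.route_allowed || !input.dst_allowed;
    if !off_path {
        return None;
    }
    match input.transport {
        // Leak-A: plaintext DNS outside the safe path.
        Transport::Udp53 | Transport::Tcp53 => Some(LeakType::A),
        // Leak-C: encrypted DNS outside the approved egress.
        Transport::Dot | Transport::Doh => Some(LeakType::C),
        // Unclassified traffic is not reported as a leak.
        Transport::Unknown => None,
    }
}
```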
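
And a small sketch of how the privacy modes might apply to `qname` in the report builder. `redact_qname` is a hypothetical helper, and keeping the last two labels is only a naive stand-in for eTLD+1 (it gets names like `example.co.uk` wrong); a real implementation would consult the Public Suffix List or hash the name instead.

```rust
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
pub enum PrivacyMode { Full, Redacted, Minimal }

/// Apply the privacy mode to a query name before it reaches the report.
/// Returns `None` when the name must not be stored at all.
pub fn redact_qname(qname: &str, mode: PrivacyMode) -> Option<String> {
    match mode {
        PrivacyMode::Full => Some(qname.to_string()),
        PrivacyMode::Minimal => None,
        PrivacyMode::Redacted => {
            // Naive approximation of eTLD+1: keep at most the last two labels.
            let trimmed = qname.trim_end_matches('.');
            let labels: Vec<&str> = trimmed.split('.').collect();
            let keep = labels.len().min(2);
            Some(labels[labels.len() - keep..].join("."))
        }
    }
}
```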

---

# Small note about *where* it lives in your architecture

This design slots in with **minimal churn**:

* ✅ Add `crates/wtfnet-dnsleak`
* ✅ Add `dns leak ...` subcommands in CLI
* ✅ Add a **single** new platform trait for process attribution (best-effort)
* ✅ Reuse your existing `dns watch` capture approach as the sensor

…which is exactly what your design describes: passive monitoring + classification + rules + evidence output, with the PAE as the “event enricher” stage in the pipeline.

**If it's too hard to detect DoH traffic, skip it.**

---