Files
WTFnet/docs/implementation_notes.md
2026-01-16 00:38:03 +08:00

661 lines
14 KiB
Markdown
Raw Permalink Blame History

This file contains ambiguous Unicode characters
This file contains Unicode characters that might be confused with other characters. If you think that this is intentional, you can safely ignore this warning. Use the Escape button to reveal them.
# IMPLEMENTATION.md
## 0) Project identity
* **Project name:** WTFnet
* **Binary name:** `wtfn`
* **Tagline:** *“What the f*ck is my networking doing?”*
Target OS (first-class):
* ✅ Linux (Debian/Ubuntu)
* ✅ Windows 10/11 + Windows Server
---
## 1) Workspace layout (Cargo)
Recommended workspace structure: keep the CLI thin, keep logic reusable, and isolate OS-specific code.
```
wtfnet/
├─ Cargo.toml
├─ crates/
│ ├─ wtfnet-cli/ # bin: clap parsing + output formatting
│ ├─ wtfnet-core/ # shared types: errors, output schema, config, logging init
│ ├─ wtfnet-platform/ # platform traits + OS dispatch
│ ├─ wtfnet-platform-linux/ # Linux implementations (netlink/procfs)
│ ├─ wtfnet-platform-windows/ # Windows implementations (Win32 APIs)
│ ├─ wtfnet-geoip/ # GeoLite2 Country+ASN mmdb
│ ├─ wtfnet-probe/ # ping/tcping/trace + geoip enrichment
│ ├─ wtfnet-dns/ # query/detect/watch
│ ├─ wtfnet-http/ # HTTP/1.1, HTTP/2, optional HTTP/3
│ ├─ wtfnet-tls/ # TLS handshake/cert/verify/alpn
│ ├─ wtfnet-discover/ # mdns/ssdp discovery
│ └─ wtfnet-diag/ # diag bundle orchestrator
└─ docs/ # design/usage docs (optional)
```
### Root `Cargo.toml` (workspace)
```toml
[workspace]
resolver = "2"
members = [
"crates/wtfnet-cli",
"crates/wtfnet-core",
"crates/wtfnet-platform",
"crates/wtfnet-platform-linux",
"crates/wtfnet-platform-windows",
"crates/wtfnet-geoip",
"crates/wtfnet-probe",
"crates/wtfnet-dns",
"crates/wtfnet-http",
"crates/wtfnet-tls",
"crates/wtfnet-discover",
"crates/wtfnet-diag",
]
[workspace.package]
edition = "2021"
license = "MIT"
```
---
## 2) Subcrate responsibilities
### 2.1 `wtfnet-cli` (binary)
**Purpose**
* Defines `clap` subcommands & global flags (`--json`, `--pretty`, logging flags, timeouts)
* Calls into library crates
* Converts results → human tables OR JSON
**Strict rule**
* No OS-specific code here.
* No heavy logic here.
---
### 2.2 `wtfnet-core` (shared kernel)
**Owns**
* Output schema wrapper (`meta`, `command`, `data`, `warnings`, `errors`)
* Error taxonomy + exit code mapping
* Config loading (flags/env/config.json)
* Logging initialization (`tracing`)
* Formatting helpers (durations, bytes, IP formatting)
* “human table renderer” utilities (optional)
---
### 2.3 `wtfnet-platform` (traits + dispatch)
**Purpose**
Expose a stable interface for sysadmin-ish data:
* sys: interfaces/IP/route/DNS config snapshot
* ports: listening sockets + PID/process info
* cert roots: enumerate trusted roots
* neigh: ARP/NDP cache
**Pattern**
* Define Rust traits like `SysProvider`, `PortsProvider`, `CertProvider`, `NeighProvider`
* Provide a `Platform` object that selects implementation by target OS
---
### 2.4 `wtfnet-platform-linux`
Linux implementations:
* netlink-based route/IP/neigh: use `rtnetlink` ([Docs.rs][1])
* ports/process mapping:
* either use `listeners` (cross-platform) or Linux procfs parsing
* DNS snapshot: parse `/etc/resolv.conf` + detect `systemd-resolved` best-effort
---
### 2.5 `wtfnet-platform-windows`
Windows implementations:
* use `windows` crate / Win32 APIs
* interfaces/routes/neigh via IP Helper APIs
* ports/process via Windows socket/process APIs
* trusted roots via Windows cert store APIs
---
### 2.6 `wtfnet-geoip`
* Loads **local** GeoLite2 Country + ASN mmdb
* Provides one `GeoIpService` with `lookup(ip)` returning:
* country + ISO
* ASN + org
---
### 2.7 `wtfnet-probe`
Implements:
* ping (v4/v6)
* tcping (connect latency)
* trace (traceroute-like)
* optional `--geoip` enrichment
For ping, you can start with `surge-ping` ([Crates][2]) or `ping-async` ([Docs.rs][3]) (both async-oriented).
---
### 2.8 `wtfnet-dns`
Implements:
* `dns query` (dig-like)
* `dns detect` (poisoning compare)
* `dns watch` (passive, best-effort)
Use **Hickory DNS** (Trust-DNS rebrand) ([Docs.rs][4]) for a solid resolver/client base.
---
### 2.9 `wtfnet-http`
Implements:
* `http head|get`
* HTTP/2 support is required (via `reqwest`/`hyper`)
* HTTP/3 optional behind feature flag:
* `h3` + `h3-quinn` + `quinn` ([GitHub][5])
---
### 2.10 `wtfnet-tls`
Implements:
* `tls handshake|cert|verify|alpn`
* Use `rustls` for handshake parsing
* For system trust verification:
* `rustls-native-certs` for loading OS roots ([Crates][6])
* optionally `rustls-platform-verifier` for “verify like the OS” behavior ([GitHub][7])
---
### 2.11 `wtfnet-discover`
Implements bounded local discovery:
* mDNS service discovery: `mdns-sd` ([Crates][8])
* SSDP discovery: `ssdp-client` ([Crates][9])
---
### 2.12 `wtfnet-diag`
Orchestrates:
* sys snapshot
* routes
* ports listen
* neigh list
* optional quick checks: DNS detect, TLS handshake, HTTP head
* bundle export to zip
---
## 3) Dependency map (crate graph)
**High-level dependency graph:**
```
wtfnet-cli
├─ wtfnet-core
├─ wtfnet-platform
│ ├─ wtfnet-platform-linux (cfg linux)
│ └─ wtfnet-platform-windows (cfg windows)
├─ wtfnet-geoip
├─ wtfnet-probe
├─ wtfnet-dns
├─ wtfnet-http
├─ wtfnet-tls
├─ wtfnet-discover
└─ wtfnet-diag
```
**Rule of thumb**
* `wtfnet-core` should not depend on OS crates.
* feature crates should depend on `wtfnet-core` + minimal extras.
---
## 4) Shared libraries / crate dependencies (recommended)
### 4.1 Core stack (almost everywhere)
* **CLI**: `clap` (+ `clap_complete` optional)
* **Async runtime**: `tokio`
* **Serialization**: `serde`, `serde_json`
* **Errors**: `thiserror` (libs), `anyhow` (CLI glue)
* **URLs**: `url`
* **Time**: `time` (or `chrono`), `humantime`
* **Tables (human output)**: `tabled` or `comfy-table`
* **Zip bundles**: `zip` (diag bundle)
### 4.2 Logging / tracing
Use `tracing` + `tracing-subscriber`:
* `tracing`
* `tracing-subscriber` (fmt + env filter)
* optional: `tracing-appender` (log file)
**Why this choice**
* structured logs
* spans for timing each probe stage
* easy JSON logs to stderr
---
### 4.3 Sys/platform related crates
* Interfaces: `network-interface` ([Crates][10]) (good for standardized interface listing)
* Linux netlink: `rtnetlink` ([Docs.rs][1])
* Ports:
* `listeners` (cross-platform “listening process mapping”) ([GitHub][11])
* fallback: `netstat2` (cross-platform sockets info) ([Docs.rs][12])
* Windows APIs: `windows` crate
> Suggestion: start with `listeners` for `ports listen/who` (it directly targets your use-case). ([GitHub][11])
---
### 4.4 GeoIP
* `maxminddb` (read GeoLite2 mmdb)
---
### 4.5 DNS
* `hickory-resolver` / Hickory ecosystem ([Docs.rs][4])
Passive `dns watch` (optional):
* `pcap` or `pnet` (feature-gated)
---
### 4.6 HTTP / TLS
HTTP:
* `reqwest` (easy HTTP/1.1 + HTTP/2)
* or `hyper` if you want lower-level control
HTTP/3 (optional feature):
* `h3` + `h3-quinn` + `quinn` ([GitHub][5])
TLS:
* `rustls`
* `rustls-native-certs` ([Crates][6])
* `rustls-platform-verifier` (optional) ([GitHub][7])
---
### 4.7 Probing
* ping:
* `surge-ping` ([Crates][2])
* or `ping-async` ([Docs.rs][3])
* tcping: plain `tokio::net::TcpStream::connect` + `timeout`
* trace: start simple (UDP/ICMP-based approaches are trickier & permission-sensitive)
---
### 4.8 Discovery
* mDNS: `mdns-sd` ([Crates][8])
* SSDP: `ssdp-client` ([Crates][9])
---
## 5) Feature flags & compile-time options
In root design, define optional features to avoid heavy deps by default.
Suggested features:
* `http3` → enables `h3`, `h3-quinn`, `quinn`
* `pcap` → enables passive DNS watch with packet capture
* `discover` → enable mdns/ssdp features (if you want a “minimal build”)
Example snippet (in `wtfnet-http/Cargo.toml`):
```toml
[features]
default = ["http2"]
http2 = []
http3 = ["dep:h3", "dep:h3-quinn", "dep:quinn"]
[dependencies]
reqwest = { version = "*", features = ["rustls-tls", "http2"] }
h3 = { version = "*", optional = true }
h3-quinn = { version = "*", optional = true }
quinn = { version = "*", optional = true }
```
---
## 6) Core data model + output schema (do this early)
### 6.1 Unified JSON wrapper (recommended)
Every command returns:
```rust
pub struct CommandEnvelope<T> {
pub meta: Meta,
pub command: CommandInfo,
pub data: T,
pub warnings: Vec<Warn>,
pub errors: Vec<ErrItem>,
}
```
Key principles:
* stable keys
* additive schema evolution
* logs never pollute stdout JSON
### 6.2 Exit code mapping
Put this in `wtfnet-core` and make CLI enforce it:
```rust
pub enum ExitKind {
Ok,
Failed,
Usage,
Permission,
Timeout,
Partial,
}
```
---
## 7) Logging design (`tracing`)
### 7.1 Init once in `main()`
`wtfnet-core::logging::init(...)` should:
* respect CLI flags + env vars
* print to stderr
* support `text|json` formats
### 7.2 Use spans for “timing breakdown”
Example: HTTP diagnostics
```rust
#[tracing::instrument(skip(client))]
async fn http_head(client: &Client, url: &Url) -> Result<HttpReport, Error> {
let _span = tracing::info_span!("http_head", %url).entered();
// measure dns/connect/tls/ttfb where possible
Ok(report)
}
```
---
## 8) Platform abstraction pattern (Rust traits)
In `wtfnet-platform`:
```rust
pub trait SysProvider {
async fn interfaces(&self) -> Result<Vec<NetInterface>, PlatformError>;
async fn routes(&self) -> Result<Vec<RouteEntry>, PlatformError>;
async fn dns_config(&self) -> Result<DnsConfig, PlatformError>;
}
pub trait PortsProvider {
async fn listening(&self) -> Result<Vec<ListenSocket>, PlatformError>;
async fn who_owns(&self, port: u16) -> Result<Vec<ListenSocket>, PlatformError>;
}
pub trait CertProvider {
async fn trusted_roots(&self) -> Result<Vec<RootCert>, PlatformError>;
}
pub trait NeighProvider {
async fn neighbors(&self) -> Result<Vec<NeighborEntry>, PlatformError>;
}
```
Then provide:
```rust
pub struct Platform {
pub sys: Arc<dyn SysProvider + Send + Sync>,
pub ports: Arc<dyn PortsProvider + Send + Sync>,
pub cert: Arc<dyn CertProvider + Send + Sync>,
pub neigh: Arc<dyn NeighProvider + Send + Sync>,
}
```
OS dispatch:
* `cfg(target_os = "linux")``wtfnet-platform-linux`
* `cfg(target_os = "windows")``wtfnet-platform-windows`
---
## 9) Implementation notes per command area
### 9.1 sys (interfaces/routes/dns)
**Interfaces**
* start with `network-interface` ([Crates][10]) for a normalized list
* if you need MTU / gateway details not exposed, add platform-native calls
**Linux routes/neigh**
* `rtnetlink` can manage links/addresses/ARP/route tables ([Docs.rs][1])
**Windows routes/neigh**
* use Win32 IP helper APIs via `windows` crate
---
### 9.2 ports (listen/who)
Best path:
* use `listeners` crate for cross-platform “listening sockets → process” mapping ([GitHub][11])
Fallback:
* `netstat2` provides cross-platform socket information ([Docs.rs][12])
---
### 9.3 cert roots
Use:
* `rustls-native-certs` to load roots from OS trust store ([Crates][6])
Filtering:
* by subject substring
* by fingerprint sha256
* expired/not-yet-valid
Diff:
* export stable JSON
* compare by fingerprint (sha256)
---
### 9.4 geoip (local mmdb)
Use:
* `maxminddb`
Expose:
* `GeoIpService::new(country_path, asn_path)`
* `lookup(IpAddr) -> GeoInfo`
---
### 9.5 dns (query/detect/watch)
Resolver:
* Hickory DNS ecosystem ([Docs.rs][4])
Detect logic (keep deterministic):
* Query multiple resolvers
* Normalize answers (A/AAAA/CNAME)
* “suspicious” if major divergence, NXDOMAIN mismatch, private IP injection patterns
Watch:
* feature-gate packet capture
* document privilege needs clearly
---
### 9.6 http (head/get)
HTTP/2:
* `reqwest` makes this trivial
HTTP/3 (optional):
* use `h3` + `h3-quinn` + `quinn` ([GitHub][5])
Keep it behind `--http3` and fallback to HTTP/2 when UDP is blocked.
Timing breakdown:
* youll get total time easily
* fine-grained DNS/connect/TLS timing may need deeper client hooks (ok to be best-effort)
---
### 9.7 tls (handshake/cert/verify)
Handshake:
* use `rustls` to connect and extract:
* version, cipher suite, ALPN
* peer cert chain
Verify:
* use `rustls-platform-verifier` if you want OS-like verification ([GitHub][7])
* otherwise load roots via `rustls-native-certs` ([Crates][6]) and verify with webpki
---
### 9.8 neigh (ARP/NDP)
Linux:
* `rtnetlink` includes ARP/neighbor operations ([Docs.rs][1])
Windows:
* IP Helper API provides neighbor cache info (implementation detail)
---
### 9.9 discover (mDNS + SSDP)
mDNS:
* `mdns-sd` ([Crates][8])
Bounded `--duration`, no spam.
SSDP:
* `ssdp-client` ([Crates][9])
Send M-SEARCH, collect responses, parse location/server/usn.
---
## 10) Testing strategy
### 10.1 Unit tests (fast, pure)
* subnet math (`calc`)
* parsing/formatting
* DNS comparison heuristics (test vectors)
### 10.2 Snapshot tests (JSON stability)
Use `insta`:
* ensure `--json` schema doesnt drift accidentally
### 10.3 Integration tests (CI)
* run non-privileged commands only:
* `sys ifaces`
* `calc subnet`
* `dns query example.com A`
* `http head https://example.com`
---
## 11) Coding conventions
* Every command handler returns a structured `CommandEnvelope<T>`
* Never `println!` from libs; return data → CLI prints it
* `--json` must be clean stdout (no logs mixed in)
* Use timeouts everywhere in probe/dns/http/tls
* Prefer “best-effort + warnings” over hard failure
---
## 12) Minimal “first coding milestone” plan
1. `wtfnet-core`: envelope + logging init + exit mapping
2. `wtfnet-cli`: clap skeleton + `sys ifaces`
3. `wtfnet-geoip`: load mmdb + `geoip <ip>`
4. `ports listen/who` using `listeners` ([GitHub][11])
5. `dns query` via Hickory ([Docs.rs][4])
6. `http head` and `tls handshake` basic success path
7. `diag` orchestration + zip bundle
---