From 46513354c0149e719c7faa81f2f1ce0a3ea37aca Mon Sep 17 00:00:00 2001 From: xah30 Date: Mon, 25 May 2026 18:40:19 +0300 Subject: [PATCH] docs: add protocol, PKI, and split-tunnel documentation MIME-Version: 1.0 Content-Type: text/plain; charset=UTF-8 Content-Transfer-Encoding: 8bit docs/protocol.md, docs/pki.md, docs/split-tunnel.md — written from the actual implementation (pinned handshake order, ML-KEM-768/FIPS 203, seq||AEAD records with replay window, QUIC/H3 mimicry) including honest v1 limitations. Co-Authored-By: Claude Opus 4.7 --- docs/pki.md | 232 +++++++++++++++++++++++++++ docs/protocol.md | 371 +++++++++++++++++++++++++++++++++++++++++++ docs/split-tunnel.md | 241 ++++++++++++++++++++++++++++ 3 files changed, 844 insertions(+) create mode 100644 docs/pki.md create mode 100644 docs/protocol.md create mode 100644 docs/split-tunnel.md diff --git a/docs/pki.md b/docs/pki.md new file mode 100644 index 0000000..e99252c --- /dev/null +++ b/docs/pki.md @@ -0,0 +1,232 @@ +# Aura PKI + +Aura uses a small, self-contained X.509 PKI for **mutual authentication** of the inner +handshake. A single self-signed Aura **CA** issues one **server** certificate and one +**client** certificate per client. During the handshake the client verifies the server's +certificate and the server verifies the client's certificate, both against the CA. + +The PKI is implemented in the `aura-pki` crate (`ca.rs`, `cert.rs`, `store.rs`) and exposed on +the command line as `aura pki ...` (`crates/aura-cli/src/pki.rs`, +`crates/aura-cli/src/main.rs`). + +> The outer QUIC/TLS layer does **not** use this PKI — it accepts any certificate (see +> `protocol.md`, "Mimicry layer"). All certificate trust lives in the inner Aura handshake. 
+ +--- + +## Trust model + +``` + Aura CA (self-signed) + CN = , isCA, keyCertSign/crlSign + | + +------------+------------+ + | | + server leaf client leaf(s) + CN = CN = + SAN: DNS: (no SAN) + EKU: serverAuth EKU: clientAuth +``` + +- The **CA** is self-signed with `BasicConstraints: CA`, and key usages + `keyCertSign` + `crlSign` + `digitalSignature`. Default lifetime **3650 days**. +- A **server leaf** carries `CN = `, a **`DNS:` SAN**, and + `extendedKeyUsage = serverAuth`. The DNS SAN is what the client matches against its expected + `server_name`. +- A **client leaf** carries `CN = ` and `extendedKeyUsage = clientAuth`. The CN is + the identity the server learns and records as the session `peer_id`. +- Leaf key usages are `digitalSignature` + `keyEncipherment`. Default lifetime **365 days**. +- All issued certs (CA and leaves) backdate `not_before` by **5 minutes** to tolerate clock + skew. + +### Algorithms + +All keys are **ECDSA P-256 / SHA-256** (rcgen's default `KeyPair::generate`). Private keys are +written in **PKCS#8 PEM**. Chain verification (in `cert.rs`) accepts ECDSA P-256/SHA-256 +(required), and also ECDSA P-384/SHA-384 and Ed25519, so a deployment can switch key types +later without code changes. + +--- + +## File layout + +The CLI keeps files in plain directories. 
Conventional names +(`crates/aura-cli/src/pki.rs`): + +| File | Constant | Contents | +|---------------|------------|-------------------------------------------| +| `ca.crt` | `CA_CERT` | CA certificate (PEM) | +| `ca.key` | `CA_KEY` | CA private key (PKCS#8 PEM) — **secret** | +| `server.crt` | | Server leaf certificate (PEM) | +| `server.key` | | Server leaf private key (PEM) — **secret**| +| `client.crt` | | Client leaf certificate (PEM) | +| `client.key` | | Client leaf private key (PEM) — **secret**| +| `revoked.crl` | `CRL_FILE` | Revocation list (one identifier per line) | + +`issue-server` and `issue-client` load the CA from `ca.crt` + `ca.key` in the CA directory and +write `server.{crt,key}` / `client.{crt,key}` into the output directory. Paths beginning with +`~` are expanded to the home directory (from `$HOME`, or `$USERPROFILE` on Windows). + +These names map directly onto the `[pki]` section of `server.toml` / `client.toml` +(`ca_cert`, `cert`, `key`). + +--- + +## `aura pki` commands + +``` +aura pki init --ca-name --out +aura pki issue-server --domain --out [--ca ] +aura pki issue-client --id --out [--ca ] +aura pki revoke --id [--crl ] +aura pki list [--crl ] +``` + +For `issue-server` / `issue-client`, `--ca` defaults to the value of `--out` (so the CA and +the issued leaf can live in the same directory). For `revoke` / `list`, `--crl` defaults to +`./revoked.crl`. + +### `init` — create a CA + +Generates a fresh self-signed CA and writes `ca.crt` + `ca.key` into `--out` (creating the +directory if needed). + +```bash +aura pki init --ca-name "Aura Root CA" --out ~/.aura +# CA generated: +# cert: ~/.aura/ca.crt +# key: ~/.aura/ca.key +``` + +### `issue-server` — issue a server certificate + +Issues a server leaf for a DNS name, signed by the CA, with a `DNS:` SAN and +`serverAuth` EKU. 
+ +```bash +aura pki issue-server --domain vpn.example.com --out ~/.aura --ca ~/.aura +# server certificate issued for 'vpn.example.com': +# cert: ~/.aura/server.crt +# key: ~/.aura/server.key +``` + +> The `--domain` must equal the name the client expects in the handshake. In the shipped +> client config that name is taken from `[client] sni`, so the camouflage SNI and the +> verified server SAN are the same value. + +### `issue-client` — issue a client certificate + +Issues a client leaf with `CN = ` and `clientAuth` EKU. The `` becomes the verified +`peer_id` the server sees. + +```bash +aura pki issue-client --id laptop --out ~/.aura --ca ~/.aura +# client certificate issued for 'laptop': +# cert: ~/.aura/client.crt +# key: ~/.aura/client.key +``` + +### `revoke` — add to the revocation list + +Adds an identifier — a **client id / Common Name** or a **certificate serial** (lowercase +hex, no separators) — to the CRL file, creating it (and parent directories) if absent. + +```bash +aura pki revoke --id laptop --crl ~/.aura/revoked.crl +# revoked 'laptop' (CRL: ~/.aura/revoked.crl) +``` + +### `list` — show revoked identifiers + +Prints the identifiers in the CRL file (empty if the file does not exist). + +```bash +aura pki list --crl ~/.aura/revoked.crl +# revoked identifiers (CRL: ~/.aura/revoked.crl): +# laptop +``` + +### End-to-end example + +```bash +# 1. Create the CA. +aura pki init --ca-name "Aura Root CA" --out ~/.aura + +# 2. Issue the server cert for its public DNS name. +aura pki issue-server --domain vpn.example.com --out ~/.aura + +# 3. Issue a client cert per device. +aura pki issue-client --id laptop --out ~/.aura + +# 4. (later) Revoke a compromised client. +aura pki revoke --id laptop +``` + +--- + +## Verification + +Verification is performed by `AuraCertVerifier` (`crates/aura-pki/src/cert.rs`), built from +the CA certificate PEM. It uses **`rustls-webpki`** to validate the peer's leaf against the CA +trust anchor. 
The Aura handshake invokes it on each side (see `protocol.md`). + +**Server certificate** (`verify_server_cert`), run by the client: + +1. webpki chain verification against the CA with extended key usage **`serverAuth`**, plus validity + (time) check. +2. The leaf must be valid for the requested `server_name` (DNS SAN match); a mismatch is + `NameMismatch`. +3. CRL check (see below). + +**Client certificate** (`verify_client_cert`), run by the server: + +1. webpki chain verification against the CA with extended key usage **`clientAuth`**, plus validity + (time) check. +2. The **client id** is extracted as the first Common Name from the leaf subject (missing CN + is `MissingIdentity`). +3. CRL check. +4. Returns the client id, which the handshake records as the session `peer_id`. + +The leaf certificate is sent **inline** in the handshake (DER, no intermediate chain); the CA +is the single trust anchor. Possession of the leaf's private key is proven separately by the +handshake signature over the transcript (see `protocol.md`). + +Errors surface as `PkiError`: `CertParse`, `EmptyChain`, `TrustAnchor`, `Verification`, +`NameMismatch`, `MissingIdentity`, `Revoked`. + +--- + +## Revocation (CRL) + +Aura v1 revocation is deliberately minimal (`crates/aura-pki/src/store.rs`). `CrlStore` is a +**set of revoked identifier strings**, where an identifier is either: + +- a certificate **serial number** (lowercase hex, no separators), or +- a **client id / Common Name**. + +During verification, if the CRL is non-empty the leaf is rejected (`Revoked`) when **either** +its serial **or** its Common Name is present in the set. An empty CRL skips the check +entirely. + +The on-disk format is one identifier per line; blank lines and `#` comments are ignored on +load. `aura pki revoke` / `aura pki list` manage this file. + +> v1 limitation: this is a flat deny set of revoked identifiers, not a signed X.509 CRL. 
There is no CRL +> signature, no `nextUpdate`, and no automatic distribution — the file must be provisioned to +> the verifying side out of band. The verifier passes `None` for webpki's own revocation +> hooks and relies solely on this set. + +--- + +## Security notes + +- **Protect the private keys.** `ca.key` is the root of all trust; anyone with it can mint + valid server/client certs. `server.key` / `client.key` must stay on their respective hosts. + The CLI writes them with default file permissions — restrict them at the OS level. +- **The CA is self-signed and unconstrained** (`BasicConstraints: CA` unconstrained). It is + the sole trust anchor; there is no intermediate CA tier in v1. +- **Server identity is name-bound.** The client only accepts a server leaf whose DNS SAN + matches the expected name, so a different valid leaf from the same CA will not be accepted + for the wrong host. +- **Revocation is best-effort** (see above): plan to distribute the CRL file and keep it in + sync on every server that verifies clients. +- **Leaf lifetime is 365 days**; plan re-issuance. There is no automated rotation in v1. diff --git a/docs/protocol.md b/docs/protocol.md new file mode 100644 index 0000000..99345b5 --- /dev/null +++ b/docs/protocol.md @@ -0,0 +1,371 @@ +# Aura Protocol + +The Aura protocol provides a mutually-authenticated, post-quantum-secure tunnel between a +client and a server. It is implemented in the `aura-proto` crate on top of `aura-crypto` +(hybrid KEM, HKDF, AEAD) and `aura-pki` (mutual X.509 verification). + +This document is for an engineer auditing or reimplementing the protocol. Everything below +reflects the **actual implementation**, not an idealized spec. Where the original spec was +ambiguous (notably the handshake message order), the implementation pins an exact choice and +that pinned choice is what is documented here. 
+ +## Layering + +``` ++-------------------------------------------------------------+ +| Application IP packets (TUN) | ++-------------------------------------------------------------+ +| Aura inner session: Frame -> AEAD-sealed Data record | <- real security boundary +| Aura inner handshake: hybrid KEM + mutual X.509 | ++-------------------------------------------------------------+ +| Outer QUIC/TLS (quinn + rustls) — MIMICRY ONLY | <- NOT a security boundary +| ALPN h3 / h3-29, Chrome-like transport params, | +| client accepts ANY server cert | ++-------------------------------------------------------------+ +| UDP | ++-------------------------------------------------------------+ +``` + +The two layers have very different jobs: + +- **Outer QUIC/TLS** is camouflage. It is configured to look like ordinary browser HTTP/3 + traffic. It performs **no** meaningful authentication — see [Mimicry layer](#mimicry-layer). +- **Inner Aura handshake/session** is the real security boundary: hybrid post-quantum key + agreement plus mutual certificate verification against the Aura CA, then an AEAD-protected + record stream with replay protection. + +The inner protocol is transport-agnostic: `client_handshake` / `server_handshake` are generic +over a separate `tokio::io::AsyncRead` reader and `AsyncWrite` writer, so the same code drives +an in-memory duplex pipe (tests) and quinn's split `RecvStream` / `SendStream` (the QUIC +transport) identically. + +--- + +## Wire format + +Every Aura protocol message is a **5-byte header** followed by a payload +(`crates/aura-proto/src/frame.rs`): + +``` +byte 0 : msg_type (u8) +bytes 1..4 : length (u24, big-endian) = payload length in bytes +byte 4 : version = 0x01 +bytes 5.. : payload (length bytes) +``` + +- `length` is a 24-bit big-endian integer, so the maximum payload is `0x00FF_FFFF` + (16 MiB − 1). An oversize payload is rejected with `FrameTooLarge`. +- `version` is `0x01`. 
A header whose byte 4 is not `0x01` is rejected with `BadVersion`. + +### Message types + +| Byte | `MsgType` | Direction | Encrypted | Role | +|--------|---------------|-----------|-----------|--------------------------------------------| +| `0x01` | `ClientHello` | C→S | no | Handshake 1: hybrid public key + nonce | +| `0x02` | `ServerHello` | S→C | no | Handshake 2: hybrid ciphertext + nonce | +| `0x03` | `ClientAuth` | C→S | yes | Handshake 4: client cert + signature | +| `0x04` | `ServerAuth` | S→C | yes | Handshake 3: server cert + signature | +| `0x05` | `Finished` | both | yes | Handshake 5/6: HMAC over the transcript | +| `0x06` | `Data` | both | yes | Application record (AEAD-sealed `Frame`) | +| `0xFF` | `Alert` | both | no | Fatal alert; payload byte 0 is the code | + +> Note: the numeric byte values do **not** follow the send order. `ServerAuth` (`0x04`) is +> sent *before* `ClientAuth` (`0x03`). The send order is fixed by the state machine +> (below), not by the type byte. + +### Application frames + +Once the session is established, the application payload carried inside each encrypted `Data` +record is a `Frame` (`crates/aura-proto/src/frame.rs`). All multi-byte integers are +big-endian: + +| Frame | Tag | Encoding | +|---------|--------|-----------------------------------------------------| +| `Data` | `0x01` | `0x01 \|\| stream_id(u32) \|\| payload` | +| `Ping` | `0x02` | `0x02 \|\| seq(u32)` | +| `Pong` | `0x03` | `0x03 \|\| seq(u32)` | +| `Close` | `0x04` | `0x04 \|\| code(u8) \|\| reason_len(u32) \|\| reason_utf8` | + +--- + +## Handshake + +### Pinned message order + +The original spec diagram was ambiguous about the order of the encrypted auth/Finished +messages. The implementation pins this exact order, and both peers follow it lock-step +(`crates/aura-proto/src/handshake.rs`): + +``` +1. C -> S ClientHello (plaintext): x25519_pub[32] || mlkem_ek[1184] || client_nonce[32] +2. 
S -> C ServerHello (plaintext): x25519_ephemeral[32] || mlkem_ct[1088] || server_nonce[32] + -- both sides derive the hybrid shared secret and the two directional SessionKeys -- +3. S -> C ServerAuth (encrypted under s2c): u16(cert_der_len) || server_leaf_cert_der || sig(transcript) +4. C -> S ClientAuth (encrypted under c2s): u16(cert_der_len) || client_leaf_cert_der || sig(transcript) +5. C -> S Finished (encrypted under c2s): HMAC-SHA256(key_c2s, transcript) +6. S -> C Finished (encrypted under s2c): HMAC-SHA256(key_s2c, transcript) + -- encrypted Data channel is now open in both directions -- +``` + +```mermaid +sequenceDiagram + participant C as Client + participant S as Server + Note over C,S: plaintext + C->>S: 1. ClientHello (x25519_pub, mlkem_ek, client_nonce) + S->>C: 2. ServerHello (x25519_eph, mlkem_ct, server_nonce) + Note over C,S: both derive shared secret + SessionKeys
transcript = SHA-256(CH_frame || SH_frame) + Note over C,S: encrypted (AEAD under directional keys) + S->>C: 3. ServerAuth (server cert + sig over transcript) + C->>S: 4. ClientAuth (client cert + sig over transcript) + C->>S: 5. Finished (HMAC_c2s over transcript) + S->>C: 6. Finished (HMAC_s2c over transcript) + Note over C,S: session established; Data records flow both ways +``` + +### Hello payloads (exact sizes) + +| Field | ClientHello | ServerHello | Bytes | +|-------------------|:-----------:|:-----------:|------:| +| X25519 pub / eph | ✔ | ✔ | 32 | +| ML-KEM-768 ek | ✔ | | 1184 | +| ML-KEM-768 ct | | ✔ | 1088 | +| nonce | ✔ | ✔ | 32 | +| **Total payload** | **1248** | **1152** | | + +Hellos are sent in plaintext and validated for exact length on receipt; a wrong length is +rejected with `MalformedHandshake`. + +### Transcript hash + +``` +transcript = SHA-256( ClientHello_frame_bytes || ServerHello_frame_bytes ) +``` + +The hash covers the **full serialized frames** (5-byte header + payload) of ClientHello and +ServerHello, exactly as transmitted on the wire. This binds the negotiated key material and +the protocol version into both the signatures and the Finished MACs. + +### Authentication (ServerAuth / ClientAuth) + +Each Auth payload is: + +``` +u16_be(cert_der_len) || leaf_cert_der || signature +``` + +- `leaf_cert_der` is the sender's **leaf certificate** in DER (sent inline; no chain — the + CA is the trust anchor on the receiving side). +- `signature` is an **ECDSA P-256 / SHA-256** signature, ASN.1 DER encoded + (`ECDSA_P256_SHA256_ASN1`), computed over the 32-byte `transcript` (via `ring`). + +Verification (`crates/aura-proto/src/handshake.rs`): + +1. The receiver builds an `AuraCertVerifier` from its configured CA PEM and verifies the + peer's leaf against the CA (chain + key-usage + validity; see `pki.md`). + - The **client** additionally requires the server leaf to be valid for the expected + `server_name` (DNS SAN match). 
+ - The **server** captures the verified **client id** (leaf Common Name) and stores it as + the session's `peer_id`. +2. The receiver extracts the leaf's EC public-key point and verifies `signature` over + `transcript`. A failure is `Signature(...)`. + +Possession of the certificate's private key is therefore proven by the signature over the +transcript; the certificate identity is proven by the CA chain check. + +### Finished + +Each side sends, then verifies, a Finished MAC bound to the transcript and the direction key: + +``` +Finished_c2s = HMAC-SHA256(key_c2s, transcript) // client sends (msg 5), server verifies +Finished_s2c = HMAC-SHA256(key_s2c, transcript) // server sends (msg 6), client verifies +``` + +Verification is constant-time (`Hmac::verify_slice`); a mismatch is `FinishedMismatch`. The +Finished exchange confirms both sides derived identical keys and agree on the full transcript. + +### Encrypted handshake messages and counter continuity + +Messages 3–6 are AEAD-sealed under the **same** two directional `AeadSession`s that protect +application Data; their nonce counters are continuous across the handshake/data boundary. + +- The AAD for each encrypted handshake message is its 5-byte frame header (binding type + + length), matching the Data-record convention. +- Each direction seals **exactly two** encrypted handshake messages before Data begins: + - c2s seals `ClientAuth` (counter 0) and `Finished` (counter 1) + - s2c seals `ServerAuth` (counter 0) and `Finished` (counter 1) +- Therefore both directions reach AEAD counter **2** at the end of the handshake, and the + first application Data record stamps `seq == 2` (`POST_HANDSHAKE_COUNTER`). This seeds the + replay window (below). + +--- + +## Hybrid KEM + +The key exchange is a hybrid of classical X25519 ECDH and post-quantum ML-KEM-768 +(`crates/aura-crypto/src/kem/`). An attacker must break **both** primitives to recover the +session key. 
+ +> **ML-KEM-768 (FIPS 203)**, via the RustCrypto `ml-kem` crate (v0.3) — this is the +> standardized FIPS 203 scheme, **not** round-3 Kyber. + +### Roles + +- The **client** owns the long-term `HybridPrivateKey` and publishes its `HybridPublicKey` + in ClientHello. +- The **server** calls `encapsulate()` against that public key: it generates an **ephemeral** + X25519 keypair and an ML-KEM encapsulation, returns the `HybridCiphertext` in ServerHello, + and derives the shared secret. +- The **client** recovers the same secret via `decapsulate()`. + +So X25519 is **ephemeral–static** (server ephemeral against client static public), while +ML-KEM is a standard KEM against the client's encapsulation key. + +### Sizes + +| Quantity | Bytes | Constant | +|-----------------------------------|------:|---------------------| +| X25519 public / ephemeral / secret| 32 | `X25519_LEN` | +| ML-KEM-768 encapsulation key (ek) | 1184 | `EK_LEN` | +| ML-KEM-768 ciphertext (ct) | 1088 | `CT_LEN` | +| ML-KEM-768 shared secret | 32 | `SS_LEN` | +| ML-KEM-768 decapsulation key (dk) | 2400 | `DK_LEN` | + +> **Implementation detail — dk encoding.** The decapsulation (secret) key is stored in the +> FIPS 203 **expanded 2400-byte** form (`ExpandedKeyEncoding`), not the 64-byte seed that +> `ml-kem` 0.3 prefers. This is the encoding the project's ACVP / FIPS-203 known-answer test +> vectors operate on, so it is used for interop/KAT compatibility. The dk never travels on the +> wire — only `ek` (1184 B) and `ct` (1088 B) do. + +### Combined shared secret + +``` +shared = x25519_ss (32 B) || mlkem_ss (32 B) // 64 bytes total +``` + +ML-KEM decapsulation is infallible on a correctly sized ciphertext: a tampered ciphertext +yields a pseudo-random secret (implicit rejection) rather than an error, which surfaces later +as an AEAD/Finished failure. 
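
The byte accounting in this section can be checked mechanically. The following is a minimal Rust sketch, not the `aura-crypto` code: the constant names `X25519_LEN`, `EK_LEN`, `CT_LEN`, and `SS_LEN` come from the sizes table, while `NONCE_LEN`, `CLIENT_HELLO_LEN`, `SERVER_HELLO_LEN`, `ServerHelloFields`, `split_server_hello`, and `combine_shared` are names assumed for illustration. It performs no cryptography and omits any zeroization of secret material.

```rust
// Sizes quoted in the tables; names marked "assumed" are not from the Aura source.
pub const X25519_LEN: usize = 32;
pub const EK_LEN: usize = 1184;
pub const CT_LEN: usize = 1088;
pub const SS_LEN: usize = 32;
pub const NONCE_LEN: usize = 32; // assumed name

// Exact Hello payload lengths (assumed names): 1248 and 1152 bytes.
pub const CLIENT_HELLO_LEN: usize = X25519_LEN + EK_LEN + NONCE_LEN;
pub const SERVER_HELLO_LEN: usize = X25519_LEN + CT_LEN + NONCE_LEN;

/// The three ServerHello fields, borrowed from the payload buffer (assumed type).
pub struct ServerHelloFields<'a> {
    pub x25519_ephemeral: &'a [u8],
    pub mlkem_ct: &'a [u8],
    pub server_nonce: &'a [u8],
}

/// Split a ServerHello payload into its fields.
/// Returns `None` on a wrong length, where the document says the real code
/// reports `MalformedHandshake`.
pub fn split_server_hello(payload: &[u8]) -> Option<ServerHelloFields<'_>> {
    if payload.len() != SERVER_HELLO_LEN {
        return None;
    }
    let (x25519_ephemeral, rest) = payload.split_at(X25519_LEN);
    let (mlkem_ct, server_nonce) = rest.split_at(CT_LEN);
    Some(ServerHelloFields { x25519_ephemeral, mlkem_ct, server_nonce })
}

/// shared = x25519_ss || mlkem_ss (64 bytes), in that order.
pub fn combine_shared(
    x25519_ss: &[u8; X25519_LEN],
    mlkem_ss: &[u8; SS_LEN],
) -> [u8; X25519_LEN + SS_LEN] {
    let mut shared = [0u8; X25519_LEN + SS_LEN];
    shared[..X25519_LEN].copy_from_slice(x25519_ss);
    shared[X25519_LEN..].copy_from_slice(mlkem_ss);
    shared
}
```

The fixed concatenation order matters: both peers must place the X25519 secret first, otherwise the key derivation inputs differ and the handshake fails at the first encrypted message.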
+ +--- + +## Key derivation (HKDF) + +Directional session keys are derived with **HKDF-SHA256** (RFC 5869) +(`crates/aura-crypto/src/kdf.rs`): + +``` +salt = client_nonce || server_nonce (64 bytes) +IKM = x25519_ss || mlkem_ss (64 bytes) +info = "aura-v1-session" +OKM = HKDF-Expand(HKDF-Extract(salt, IKM), info, 64) (64 bytes) + +key_client_to_server = OKM[0..32] +key_server_to_client = OKM[32..64] +``` + +The derivation is fully deterministic in its inputs. The `info` string provides domain +separation. Intermediate secret material (`salt`, `IKM`, `OKM`) is zeroized after use, and +`SessionKeys` zeroizes its keys on drop. + +--- + +## AEAD + +The record cipher is **ChaCha20-Poly1305** (`crates/aura-crypto/src/aead.rs`). An +`AeadSession` holds a 256-bit key and a 64-bit message counter; each direction has its own +session. + +### Nonce scheme + +The 96-bit (12-byte) nonce is derived from the counter: + +``` +nonce[0..8] = counter as little-endian u64 +nonce[8..12] = 0x00 00 00 00 +``` + +The counter advances by one on every `seal` **and** every `open` (even on a failed `open`), +so a paired seal/open stay aligned without transmitting the nonce. The nonce is never reused +within a session (the 2^64 counter wrap is unreachable; an overflow panics rather than +reusing a nonce). The key is zeroized on drop. + +--- + +## Data records and replay protection + +After the handshake, application `Frame`s are exchanged as `Data` records +(`crates/aura-proto/src/session.rs`). Each `Data` record's **payload** is: + +``` +seq (u64, big-endian) || ChaCha20Poly1305_seal( frame_bytes, aad = header || seq ) +``` + +- `seq` is the 8-byte big-endian record counter. On the happy path it equals the sealing + AEAD's counter (and the receiver's expected AEAD counter). +- The AEAD **AAD** is the 5-byte frame `header` concatenated with the 8-byte `seq`, so the + record is cryptographically bound to both its declared length/type and its claimed position. 
+- The ciphertext includes the 16-byte Poly1305 tag. + +So the full record on the wire is: + +``` +[ header(5) ][ seq(8) ][ ciphertext + tag ] +\_____________________________________________/ + header.length = 8 + len(ciphertext+tag) +``` + +### Sliding replay window + +The receiver runs a **64-wide sliding-window** replay check (`REPLAY_WINDOW = 64`) *before* +touching the AEAD, so a duplicate or too-old record is rejected with `Replay(seq)` without +disturbing the AEAD counter (the session stays usable). The window: + +- tracks the highest accepted `seq` plus a 64-bit bitmap of accepted positions below it; +- accepts a `seq` iff it is strictly newer than everything seen, or falls within the window + and has not been seen before; +- rejects a `seq` that equals the current highest, is already marked in the bitmap, or is + more than `REPLAY_WINDOW` below the highest. + +The window is seeded at the post-handshake counter (`start = 2`): everything strictly below +`start` is treated as already-consumed, so the first legitimate Data record (`seq == 2`) is +accepted as "newer". + +### Full-duplex split + +A `Session` can be `split()` into independent `SessionSender` (writer + outbound AEAD + +send counter) and `SessionReceiver` (reader + inbound AEAD + replay window) halves, which can +be driven from separate tasks for a concurrent read/write data path (e.g. the VPN tunnel). +`recv_frame` is **not** cancellation-safe and must be driven from a single owning task. + +--- + +## Mimicry layer + +The outer QUIC/TLS layer (`crates/aura-transport/`) exists purely to disguise the connection +as browser HTTP/3 traffic. It is explicitly **not** the authentication boundary. + +- **ALPN** advertises `h3` and `h3-29` (`ALPN_H3`) — exactly what Chrome offers for HTTP/3 — + so the ALPN extension is indistinguishable from a real browser's. 
+- **Transport params** mirror a Chromium HTTP/3 connection: ~30 s idle timeout, ~15 s + keep-alive, 100 concurrent bidi/uni streams, ~10 MB flow-control receive windows + (`chrome_quic_transport_config`). +- **SNI** defaults to a generic CDN-looking hostname (`cdn.example.com`) when the caller does + not supply one; deployments pass their own camouflage hostname. +- The QUIC **client accepts any server certificate** (`AcceptAnyServerCert` — all verifier + methods return success). This is safe *only* because the outer TLS is not authentication: + the real mutual auth is the inner Aura handshake. The server's outer TLS likewise disables + client auth (`with_no_client_auth`). + +> Do not reuse `AcceptAnyServerCert` anywhere the TLS layer *is* the authentication boundary. + +--- + +## Error model + +The protocol layer surfaces `ProtoError` (`crates/aura-proto/src/lib.rs`), including: +`Io`, `Crypto`, `Pki`, `UnknownMsgType`, `BadVersion`, `FrameTooLarge`, `UnexpectedMsg`, +`MalformedHandshake`, `MalformedFrame`, `Signature`, `FinishedMismatch`, `Replay`, and +`Alert`. A peer may send a fatal `Alert` frame (type `0xFF`); the first payload byte is the +alert code, surfaced to the local side as `ProtoError::Alert(code)`. diff --git a/docs/split-tunnel.md b/docs/split-tunnel.md new file mode 100644 index 0000000..fa444fc --- /dev/null +++ b/docs/split-tunnel.md @@ -0,0 +1,241 @@ +# Aura Split Tunnel + +Split tunneling decides, per destination IP, whether a packet travels **through the encrypted +VPN** or **egresses directly** (bypassing the tunnel). It lets you keep, say, RFC1918 LAN +traffic local while sending the rest through Aura — or the reverse. + +It is implemented in the `aura-tunnel` crate (`routes.rs`, `router.rs`, `dns.rs`), configured +statically via the `[tunnel.split]` section of `client.toml` +(`crates/aura-cli/src/config.rs`), and managed live via the `aura route` / `aura status` +admin commands (`crates/aura-cli/src/admin.rs`). 
+ +--- + +## Concept: VPN vs DIRECT + +Every outbound IP packet read from the TUN device is classified into one of two actions +(`RouteAction`): + +- **`Vpn`** — encrypt and send the packet over the Aura connection to the server. +- **`Direct`** — let the packet egress directly, bypassing the tunnel. + +The router (`AuraRouter::run`, `router.rs`) parses each packet's destination IP, classifies +it, and dispatches: + +``` +TUN read --> parse dst IP --> RouteTable.classify(dst) --> Vpn? -> conn.send_packet() + \ Direct? -> send_direct() (v1 stub) +``` + +> **v1 limitation — `Direct` is a stub.** `send_direct` currently **logs and drops** the +> packet; real raw-socket / OS-stack re-injection is out of scope for v1. The method is +> already `async` and fallible so a real egress path can slot in without changing call sites. +> The VPN path is fully functional end-to-end. Packets whose destination cannot be parsed +> (not IPv4/IPv6, or too short) are dropped with a trace. + +The inbound direction is straightforward: decrypted IP packets received from the peer are +written back to the TUN device. + +--- + +## Rules + +The routing table (`RouteTable`, `routes.rs`) holds three things: a set of **CIDR rules**, a +set of **domain rules**, and a **default action**. + +### CIDR rules + +A CIDR rule is an `IpNetwork` (e.g. `10.0.0.0/8`) plus an action. CIDR rules are keyed by +network, so re-adding the same network **overwrites** its action. + +### Domain rules + +A domain rule is a domain name plus an action. Domains do **not** match IPs directly. Instead +`AuraDns` (`dns.rs`) resolves the domain via the system resolver (hickory) and inserts each +resulting address as a **host route** — `/32` for IPv4, `/128` for IPv6 — so it participates +in the normal longest-prefix match. Resolution results are cached. + +> Because domain rules become host routes at resolution time, they only take effect once the +> domain has been resolved (at startup, or on demand). 
They reflect the addresses seen at +> resolution time and are not continuously re-resolved in v1. + +### Default action + +If no CIDR rule (including resolved domain host routes) matches a destination, the table's +**default action** applies. + +--- + +## Longest-prefix precedence + +`classify(dst_ip)` performs a **longest-prefix match** (`routes.rs`): + +> Among all CIDR rules whose network contains the destination, the rule with the **largest +> prefix length** (most specific) wins. If no rule matches, the default action is returned. + +This lets a specific range override a broader one regardless of insertion order. IPv4 rules +only match IPv4 destinations and IPv6 rules only match IPv6 destinations. + +Example (from the shipped config): with `default = VPN`, `192.168.0.0/16 = Direct`, +`10.0.0.0/8 = Direct`, and `10.7.0.0/24 = Vpn`: + +| Destination | Matched rule | Action | +|--------------|----------------------|--------| +| `10.1.2.3` | `10.0.0.0/8` | Direct | +| `10.7.0.9` | `10.7.0.0/24` (more specific, wins over `/8`) | Vpn | +| `192.168.1.1`| `192.168.0.0/16` | Direct | +| `8.8.8.8` | (none) → default | Vpn | + +> Edge case: if the **same network** is added twice, the **last-inserted** action wins +> (it overwrites the earlier entry, since rules are keyed by network). Two distinct networks +> with the same prefix length cannot both contain one destination, so no other tie can occur. + +--- + +## Static config: `[tunnel.split]` + +The split tunnel is configured in `client.toml` under `[tunnel.split]` +(`crates/aura-cli/src/config.rs`). `build_route_table` turns it into a `RouteTable`: CIDR +rules are applied directly; domain rules are recorded and returned for the client to resolve +at startup. 
+ +### Schema + +| Key | Type | Default | Meaning | +|------------------------------|-----------------|---------|----------------------------------------------------| +| `default` | string | `"VPN"` | Action when no rule matches: `VPN` / `DIRECT` (case-insensitive) | +| `[[tunnel.split.direct]]` | array of rules | `[]` | Rules forcing matching destinations to **Direct** | +| `[[tunnel.split.vpn]]` | array of rules | `[]` | Rules forcing matching destinations through the **VPN** | + +Each rule in `direct` / `vpn` is a table with **exactly one** of: + +| Key | Type | Example | +|----------|--------|---------------------| +| `cidr` | string | `"192.168.0.0/16"` | +| `domain` | string | `"intranet.example.com"` | + +A rule with both `cidr` and `domain`, or neither, is rejected when the route table is built. + +### Example + +```toml +# Split-tunnel routing: the default action plus per-destination overrides. +[tunnel.split] +# Default for destinations matching no rule below: "VPN" or "DIRECT". +default = "VPN" + +# Send these directly (bypass the tunnel): RFC1918 ranges stay on the LAN... +[[tunnel.split.direct]] +cidr = "192.168.0.0/16" + +[[tunnel.split.direct]] +cidr = "10.0.0.0/8" + +# ...and a corporate domain egresses directly (resolved to host routes at startup). +[[tunnel.split.direct]] +domain = "intranet.example.com" + +# Force a more-specific range back through the VPN (longest-prefix wins over 10.0.0.0/8). +[[tunnel.split.vpn]] +cidr = "10.7.0.0/24" +``` + +This is the configuration shipped in `config/client.toml.example`. + +--- + +## Live management: `aura route` / `aura status` + +A running `aura client` (or `aura server`) hosts an **admin socket** — a tiny JSON +line-protocol over a **Unix domain socket** (`crates/aura-cli/src/admin.rs`). The `aura +route` and `aura status` subcommands connect to it to inspect and mutate the live routing +table without restarting the tunnel. 
The default socket path is `/tmp/aura-admin.sock` +(override with `--admin-socket`). + +> Platform note: the admin socket uses Unix domain sockets (Linux/macOS). On Windows it is a +> `cfg`-gated stub that returns an explanatory error (a named-pipe transport is future work), +> so the rest of the CLI still compiles there. + +### Commands + +``` +aura route add (--cidr | --domain ) --action [--admin-socket ] +aura route list [--admin-socket ] +aura route remove --cidr [--admin-socket ] +aura status [--admin-socket ] +``` + +`route add` takes **exactly one** of `--cidr` / `--domain` (they are mutually exclusive, and +one is required), plus `--action vpn` or `--action direct`. + +```bash +# Send a CIDR directly, live. +aura route add --cidr 8.8.8.0/24 --action direct +# ok + +# Route a domain through the VPN (resolved into host routes). +aura route add --domain example.com --action vpn +# ok + +# Inspect the current rules and default. +aura route list +# default: vpn +# cidr 8.8.8.0/24 direct +# domain example.com vpn + +# Remove a CIDR rule. +aura route remove --cidr 8.8.8.0/24 +# ok (removed) # or: "ok (nothing to remove)" if it wasn't present + +# Tunnel status / counters. +aura status +# Aura tunnel status +# peer: client-1 +# default: vpn +# rules: 1 +# rx packets: 0 +# tx packets: 0 +``` + +### Behavior notes + +- **`route remove` only removes CIDR rules** — it takes `--cidr` and has no domain form. The + library `RouteTable` has no per-rule remove API, so a removal **rebuilds** the table from + the surviving rules (preserving the default). Domain rules are re-added on rebuild, but + their previously resolved host routes are dropped and re-resolved on demand. +- **`route list` enumerates a rule mirror.** The live `RouteTable` is the source of truth for + classification but does not expose iteration, so the admin layer keeps a parallel mirror in + lockstep with every mutation; `list` echoes that mirror while `classify` still uses the real + table. 
+- **`status`** reports the verified peer id, the default action, the total rule count + (CIDR + domain), and inbound/outbound packet counters. + +### Wire protocol (for reference) + +One JSON object per line, request then response (`crates/aura-cli/src/admin.rs`): + +```text +-> {"cmd":"route_add","cidr":"8.8.8.0/24","action":"direct"} +<- {"ok":true} +-> {"cmd":"route_list"} +<- {"ok":true,"default":"vpn","cidrs":[{"cidr":"8.8.8.0/24","action":"direct"}],"domains":[]} +-> {"cmd":"route_remove","cidr":"8.8.8.0/24"} +<- {"ok":true,"removed":true} +-> {"cmd":"status"} +<- {"ok":true,"peer_id":"client-1","rx_packets":0,"tx_packets":0,"default":"vpn","rules":1} +``` + +On error the response is `{"ok":false,"error":"..."}`. + +--- + +## v1 limitations summary + +- **`Direct` egress is a stub** — `Direct` packets are logged and dropped, not re-injected to + the OS stack. The VPN path is fully functional. +- **Domain rules are resolved once** (at startup / on demand) into host routes; no continuous + re-resolution. +- **`route remove` is CIDR-only** and rebuilds the table (domain host routes are re-resolved + on demand afterward). +- **Admin socket is Unix-only**; Windows is a `cfg`-gated stub. +- The server is a **single shared TUN** in v1, and the tunnel resolver `dns` config field is + informational (the system resolver is used).
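
For readers who want to script against the admin socket, the line protocol listed under "Wire protocol (for reference)" can be exercised with a few lines of Rust. This is a hedged sketch rather than the `aura-cli` client: the function names are invented for illustration, the JSON is built by string formatting with no escaping (a real client would more likely use a JSON library), and the success check assumes the compact `{"ok":true...}` form shown in the reference exchange.

```rust
use std::io::{BufRead, BufReader, Write};
use std::os::unix::net::UnixStream;

/// Build the `route_add` request line for a CIDR rule, in the documented shape.
/// No JSON escaping is performed, so pass only plain CIDRs and "vpn"/"direct".
pub fn route_add_cidr_line(cidr: &str, action: &str) -> String {
    format!("{{\"cmd\":\"route_add\",\"cidr\":\"{cidr}\",\"action\":\"{action}\"}}\n")
}

/// Build the `status` request line.
pub fn status_line() -> String {
    "{\"cmd\":\"status\"}\n".to_string()
}

/// Crude success check: looks for the literal `"ok":true` in the reply.
/// Assumes the compact formatting shown in the reference exchange.
pub fn response_ok(line: &str) -> bool {
    line.contains("\"ok\":true")
}

/// Send one request line and read one response line (Unix only).
/// `socket_path` would be `/tmp/aura-admin.sock` unless overridden.
pub fn send_request(socket_path: &str, request_line: &str) -> std::io::Result<String> {
    let mut stream = UnixStream::connect(socket_path)?;
    stream.write_all(request_line.as_bytes())?;
    let mut reply = String::new();
    BufReader::new(stream).read_line(&mut reply)?;
    Ok(reply)
}
```

A caller would pass the output of `route_add_cidr_line("8.8.8.0/24", "direct")` to `send_request` and then test the reply with `response_ok`; on failure the reply carries an `error` string as described above.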