docs: add protocol, PKI, and split-tunnel documentation

docs/protocol.md, docs/pki.md, docs/split-tunnel.md — written from the actual
implementation (pinned handshake order, ML-KEM-768/FIPS 203, seq||AEAD records
with replay window, QUIC/H3 mimicry) including honest v1 limitations.

Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>
xah30
2026-05-25 18:40:19 +03:00
parent cb89312a27
commit 46513354c0
3 changed files with 844 additions and 0 deletions
@@ -0,0 +1,371 @@
# Aura Protocol
The Aura protocol provides a mutually-authenticated, post-quantum-secure tunnel between a
client and a server. It is implemented in the `aura-proto` crate on top of `aura-crypto`
(hybrid KEM, HKDF, AEAD) and `aura-pki` (mutual X.509 verification).
This document is for an engineer auditing or reimplementing the protocol. Everything below
reflects the **actual implementation**, not an idealized spec. Where the original spec was
ambiguous (notably the handshake message order), the implementation pins an exact choice and
that pinned choice is what is documented here.
## Layering
```
+-------------------------------------------------------------+
| Application IP packets (TUN) |
+-------------------------------------------------------------+
| Aura inner session: Frame -> AEAD-sealed Data record | <- real security boundary
| Aura inner handshake: hybrid KEM + mutual X.509 |
+-------------------------------------------------------------+
| Outer QUIC/TLS (quinn + rustls) — MIMICRY ONLY | <- NOT a security boundary
| ALPN h3 / h3-29, Chrome-like transport params, |
| client accepts ANY server cert |
+-------------------------------------------------------------+
| UDP |
+-------------------------------------------------------------+
```
The two layers have very different jobs:
- **Outer QUIC/TLS** is camouflage. It is configured to look like ordinary browser HTTP/3
traffic. It performs **no** meaningful authentication — see [Mimicry layer](#mimicry-layer).
- **Inner Aura handshake/session** is the real security boundary: hybrid post-quantum key
agreement plus mutual certificate verification against the Aura CA, then an AEAD-protected
record stream with replay protection.
The inner protocol is transport-agnostic: `client_handshake` / `server_handshake` are generic
over a separate `tokio::io::AsyncRead` reader and `AsyncWrite` writer, so the same code drives
an in-memory duplex pipe (tests) and quinn's split `RecvStream` / `SendStream` (the QUIC
transport) identically.
---
## Wire format
Every Aura protocol message is a **5-byte header** followed by a payload
(`crates/aura-proto/src/frame.rs`):
```
byte 0 : msg_type (u8)
bytes 1..4 : length (u24, big-endian) = payload length in bytes
byte 4 : version = 0x01
bytes 5.. : payload (length bytes)
```
- `length` is a 24-bit big-endian integer, so the maximum payload is `0x00FF_FFFF` bytes
(16 MiB − 1). An oversize payload is rejected with `FrameTooLarge`.
- `version` is `0x01`. A header whose byte 4 is not `0x01` is rejected with `BadVersion`.
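As a minimal sketch of the header layout above (not the `aura-proto` code itself), the 5-byte header can be encoded and decoded as follows. The names `encode_header`, `decode_header`, and `HeaderError` are hypothetical; only the byte layout, the `0x01` version, and the `FrameTooLarge` / `BadVersion` conditions come from this document.

```rust
/// Protocol version byte carried in every header.
const VERSION: u8 = 0x01;
/// Largest payload length a u24 field can describe.
const MAX_PAYLOAD: usize = 0x00FF_FFFF;

/// Hypothetical error type for this sketch; the real crate reports these
/// conditions through its own error enum.
#[derive(Debug, PartialEq)]
enum HeaderError {
    FrameTooLarge,
    BadVersion,
}

/// Encode msg_type, u24 big-endian length, and version into 5 bytes.
fn encode_header(msg_type: u8, payload_len: usize) -> Result<[u8; 5], HeaderError> {
    if payload_len > MAX_PAYLOAD {
        return Err(HeaderError::FrameTooLarge);
    }
    // The top byte of the u32 is zero because payload_len fits in 24 bits.
    let len = (payload_len as u32).to_be_bytes();
    Ok([msg_type, len[1], len[2], len[3], VERSION])
}

/// Decode a 5-byte header into (msg_type, payload_len).
fn decode_header(h: [u8; 5]) -> Result<(u8, usize), HeaderError> {
    if h[4] != VERSION {
        return Err(HeaderError::BadVersion);
    }
    let len = u32::from_be_bytes([0, h[1], h[2], h[3]]) as usize;
    Ok((h[0], len))
}
```

For instance, a 1248-byte payload encodes its length as the three bytes `00 04 E0`.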
### Message types
| Byte | `MsgType` | Direction | Encrypted | Role |
|--------|---------------|-----------|-----------|--------------------------------------------|
| `0x01` | `ClientHello` | C→S | no | Handshake 1: hybrid public key + nonce |
| `0x02` | `ServerHello` | S→C | no | Handshake 2: hybrid ciphertext + nonce |
| `0x03` | `ClientAuth` | C→S | yes | Handshake 4: client cert + signature |
| `0x04` | `ServerAuth` | S→C | yes | Handshake 3: server cert + signature |
| `0x05` | `Finished` | both | yes | Handshake 5/6: HMAC over the transcript |
| `0x06` | `Data` | both | yes | Application record (AEAD-sealed `Frame`) |
| `0xFF` | `Alert` | both | no | Fatal alert; payload byte 0 is the code |
> Note: the numeric byte values do **not** follow the send order. `ServerAuth` (`0x04`) is
> sent *before* `ClientAuth` (`0x03`). The send order is fixed by the state machine
> (below), not by the type byte.
### Application frames
Once the session is established, the application payload carried inside each encrypted `Data`
record is a `Frame` (`crates/aura-proto/src/frame.rs`). All multi-byte integers are
big-endian:
| Frame | Tag | Encoding |
|---------|--------|-----------------------------------------------------|
| `Data` | `0x01` | `0x01 \|\| stream_id(u32) \|\| payload` |
| `Ping` | `0x02` | `0x02 \|\| seq(u32)` |
| `Pong` | `0x03` | `0x03 \|\| seq(u32)` |
| `Close` | `0x04` | `0x04 \|\| code(u8) \|\| reason_len(u32) \|\| reason_utf8` |
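The table above can be read as the following encoder. This is an illustrative sketch: the enum shape, field names, and `encode_frame` are assumptions made for the example, and only the tags and field order are taken from the table.

```rust
/// Hypothetical mirror of the application frame types; field names are assumed.
#[derive(Debug, PartialEq)]
enum Frame {
    Data { stream_id: u32, payload: Vec<u8> },
    Ping { seq: u32 },
    Pong { seq: u32 },
    Close { code: u8, reason: String },
}

/// Serialize a frame: one tag byte, then big-endian fields in table order.
fn encode_frame(frame: &Frame) -> Vec<u8> {
    let mut out = Vec::new();
    match frame {
        Frame::Data { stream_id, payload } => {
            out.push(0x01);
            out.extend_from_slice(&stream_id.to_be_bytes());
            out.extend_from_slice(payload);
        }
        Frame::Ping { seq } => {
            out.push(0x02);
            out.extend_from_slice(&seq.to_be_bytes());
        }
        Frame::Pong { seq } => {
            out.push(0x03);
            out.extend_from_slice(&seq.to_be_bytes());
        }
        Frame::Close { code, reason } => {
            out.push(0x04);
            out.push(*code);
            // reason_len counts UTF-8 bytes, not characters.
            out.extend_from_slice(&(reason.len() as u32).to_be_bytes());
            out.extend_from_slice(reason.as_bytes());
        }
    }
    out
}
```

Note that `Data` carries no explicit payload length: the payload runs to the end of the decrypted record.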
---
## Handshake
### Pinned message order
The original spec diagram was ambiguous about the order of the encrypted auth/Finished
messages. The implementation pins this exact order, and both peers follow it lock-step
(`crates/aura-proto/src/handshake.rs`):
```
1. C -> S ClientHello (plaintext): x25519_pub[32] || mlkem_ek[1184] || client_nonce[32]
2. S -> C ServerHello (plaintext): x25519_ephemeral[32] || mlkem_ct[1088] || server_nonce[32]
-- both sides derive the hybrid shared secret and the two directional SessionKeys --
3. S -> C ServerAuth (encrypted under s2c): u16(cert_der_len) || server_leaf_cert_der || sig(transcript)
4. C -> S ClientAuth (encrypted under c2s): u16(cert_der_len) || client_leaf_cert_der || sig(transcript)
5. C -> S Finished (encrypted under c2s): HMAC-SHA256(key_c2s, transcript)
6. S -> C Finished (encrypted under s2c): HMAC-SHA256(key_s2c, transcript)
-- encrypted Data channel is now open in both directions --
```
```mermaid
sequenceDiagram
participant C as Client
participant S as Server
Note over C,S: plaintext
C->>S: 1. ClientHello (x25519_pub, mlkem_ek, client_nonce)
S->>C: 2. ServerHello (x25519_eph, mlkem_ct, server_nonce)
Note over C,S: both derive shared secret + SessionKeys<br/>transcript = SHA-256(CH_frame || SH_frame)
Note over C,S: encrypted (AEAD under directional keys)
S->>C: 3. ServerAuth (server cert + sig over transcript)
C->>S: 4. ClientAuth (client cert + sig over transcript)
C->>S: 5. Finished (HMAC_c2s over transcript)
S->>C: 6. Finished (HMAC_s2c over transcript)
Note over C,S: session established; Data records flow both ways
```
### Hello payloads (exact sizes)
| Field | ClientHello | ServerHello | Bytes |
|-------------------|:-----------:|:-----------:|------:|
| X25519 pub / eph | ✔ | ✔ | 32 |
| ML-KEM-768 ek | ✔ | | 1184 |
| ML-KEM-768 ct | | ✔ | 1088 |
| nonce | ✔ | ✔ | 32 |
| **Total payload** | **1248** | **1152** | |
Hellos are sent in plaintext and validated for exact length on receipt; a wrong length is
rejected with `MalformedHandshake`.
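The exact-length check can be sketched as below. `X25519_LEN`, `EK_LEN`, and `CT_LEN` are the constants named in this document; `NONCE_LEN`, the `ClientHello` struct, and `parse_client_hello` are hypothetical names for illustration, and `None` stands in for the `MalformedHandshake` error.

```rust
const X25519_LEN: usize = 32;
const EK_LEN: usize = 1184;
const CT_LEN: usize = 1088;
/// Assumed name for the 32-byte hello nonce length.
const NONCE_LEN: usize = 32;

const CLIENT_HELLO_LEN: usize = X25519_LEN + EK_LEN + NONCE_LEN;
const SERVER_HELLO_LEN: usize = X25519_LEN + CT_LEN + NONCE_LEN;

/// Borrowed view of the three ClientHello fields (hypothetical type).
struct ClientHello<'a> {
    x25519_pub: &'a [u8],
    mlkem_ek: &'a [u8],
    client_nonce: &'a [u8],
}

/// Split a ClientHello payload into its fields, rejecting any other length.
fn parse_client_hello(payload: &[u8]) -> Option<ClientHello<'_>> {
    if payload.len() != CLIENT_HELLO_LEN {
        return None; // the real code reports MalformedHandshake here
    }
    let (x25519_pub, rest) = payload.split_at(X25519_LEN);
    let (mlkem_ek, client_nonce) = rest.split_at(EK_LEN);
    Some(ClientHello { x25519_pub, mlkem_ek, client_nonce })
}
```

Because every field has a fixed size, no length prefixes are needed inside the hello payloads.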
### Transcript hash
```
transcript = SHA-256( ClientHello_frame_bytes || ServerHello_frame_bytes )
```
The hash covers the **full serialized frames** (5-byte header + payload) of ClientHello and
ServerHello, exactly as transmitted on the wire. This binds the negotiated key material and
the protocol version into both the signatures and the Finished MACs.
### Authentication (ServerAuth / ClientAuth)
Each Auth payload is:
```
u16_be(cert_der_len) || leaf_cert_der || signature
```
- `leaf_cert_der` is the sender's **leaf certificate** in DER (sent inline; no chain — the
CA is the trust anchor on the receiving side).
- `signature` is an **ECDSA P-256 / SHA-256** signature, ASN.1 DER encoded
(`ECDSA_P256_SHA256_ASN1`), computed over the 32-byte `transcript` (via `ring`).
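A rough sketch of splitting a decrypted Auth payload into certificate and signature follows. The function name `split_auth_payload` is hypothetical, and rejecting an empty signature is an assumption of this sketch rather than documented behavior; certificate and signature verification are out of scope here.

```rust
/// Split `u16_be(cert_der_len) || leaf_cert_der || signature` into
/// (leaf_cert_der, signature). Returns None on a truncated payload.
fn split_auth_payload(payload: &[u8]) -> Option<(&[u8], &[u8])> {
    if payload.len() < 2 {
        return None;
    }
    let cert_len = u16::from_be_bytes([payload[0], payload[1]]) as usize;
    let rest = &payload[2..];
    if rest.len() < cert_len {
        return None; // declared certificate length exceeds the payload
    }
    let (cert, sig) = rest.split_at(cert_len);
    if sig.is_empty() {
        return None; // assumption: a signature must follow the certificate
    }
    Some((cert, sig))
}
```

The signature has no length prefix of its own: it is whatever remains after the certificate, which works because the DER-encoded ECDSA signature is the last field.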
Verification (`crates/aura-proto/src/handshake.rs`):
1. The receiver builds an `AuraCertVerifier` from its configured CA PEM and verifies the
peer's leaf against the CA (chain + key-usage + validity; see `pki.md`).
- The **client** additionally requires the server leaf to be valid for the expected
`server_name` (DNS SAN match).
- The **server** captures the verified **client id** (leaf Common Name) and stores it as
the session's `peer_id`.
2. The receiver extracts the leaf's EC public-key point and verifies `signature` over
`transcript`. A failure is `Signature(...)`.
Possession of the certificate's private key is therefore proven by the signature over the
transcript; the certificate identity is proven by the CA chain check.
### Finished
Each side sends, then verifies, a Finished MAC bound to the transcript and the direction key:
```
Finished_c2s = HMAC-SHA256(key_c2s, transcript) // client sends (msg 5), server verifies
Finished_s2c = HMAC-SHA256(key_s2c, transcript) // server sends (msg 6), client verifies
```
Verification is constant-time (`Hmac::verify_slice`); a mismatch is `FinishedMismatch`. The
Finished exchange confirms both sides derived identical keys and agree on the full transcript.
### Encrypted handshake messages and counter continuity
Messages 3–6 are AEAD-sealed under the **same** two directional `AeadSession`s that protect
application Data; their nonce counters are continuous across the handshake/data boundary.
- The AAD for each encrypted handshake message is its 5-byte frame header (binding type +
length), matching the Data-record convention.
- Each direction seals **exactly two** encrypted handshake messages before Data begins:
- c2s seals `ClientAuth` (counter 0) and `Finished` (counter 1)
- s2c seals `ServerAuth` (counter 0) and `Finished` (counter 1)
- Therefore both directions reach AEAD counter **2** at the end of the handshake, and the
first application Data record stamps `seq == 2` (`POST_HANDSHAKE_COUNTER`). This seeds the
replay window (below).
---
## Hybrid KEM
The key exchange is a hybrid of classical X25519 ECDH and post-quantum ML-KEM-768
(`crates/aura-crypto/src/kem/`). An attacker must break **both** primitives to recover the
session key.
> **ML-KEM-768 (FIPS 203)**, via the RustCrypto `ml-kem` crate (v0.3) — this is the
> standardized FIPS 203 scheme, **not** round-3 Kyber.
### Roles
- The **client** owns the long-term `HybridPrivateKey` and publishes its `HybridPublicKey`
in ClientHello.
- The **server** calls `encapsulate()` against that public key: it generates an **ephemeral**
X25519 keypair and an ML-KEM encapsulation, returns the `HybridCiphertext` in ServerHello,
and derives the shared secret.
- The **client** recovers the same secret via `decapsulate()`.
So X25519 is **ephemeral-static** (server ephemeral against client static public), while
ML-KEM is a standard KEM against the client's encapsulation key.
### Sizes
| Quantity | Bytes | Constant |
|-----------------------------------|------:|---------------------|
| X25519 public / ephemeral / secret| 32 | `X25519_LEN` |
| ML-KEM-768 encapsulation key (ek) | 1184 | `EK_LEN` |
| ML-KEM-768 ciphertext (ct) | 1088 | `CT_LEN` |
| ML-KEM-768 shared secret | 32 | `SS_LEN` |
| ML-KEM-768 decapsulation key (dk) | 2400 | `DK_LEN` |
> **Implementation detail — dk encoding.** The decapsulation (secret) key is stored in the
> FIPS 203 **expanded 2400-byte** form (`ExpandedKeyEncoding`), not the 64-byte seed that
> `ml-kem` 0.3 prefers. This is the encoding the project's ACVP / FIPS-203 known-answer test
> vectors operate on, so it is used for interop/KAT compatibility. The dk never travels on the
> wire — only `ek` (1184 B) and `ct` (1088 B) do.
### Combined shared secret
```
shared = x25519_ss (32 B) || mlkem_ss (32 B) // 64 bytes total
```
ML-KEM decapsulation is infallible on a correctly sized ciphertext: a tampered ciphertext
yields a pseudo-random secret (implicit rejection) rather than an error, which surfaces later
as an AEAD/Finished failure.
---
## Key derivation (HKDF)
Directional session keys are derived with **HKDF-SHA256** (RFC 5869)
(`crates/aura-crypto/src/kdf.rs`):
```
salt = client_nonce || server_nonce (64 bytes)
IKM = x25519_ss || mlkem_ss (64 bytes)
info = "aura-v1-session"
OKM = HKDF-Expand(HKDF-Extract(salt, IKM), info, 64) (64 bytes)
key_client_to_server = OKM[0..32]
key_server_to_client = OKM[32..64]
```
The derivation is fully deterministic in its inputs. The `info` string provides domain
separation. Intermediate secret material (`salt`, `IKM`, `OKM`) is zeroized after use, and
`SessionKeys` zeroizes its keys on drop.
---
## AEAD
The record cipher is **ChaCha20-Poly1305** (`crates/aura-crypto/src/aead.rs`). An
`AeadSession` holds a 256-bit key and a 64-bit message counter; each direction has its own
session.
### Nonce scheme
The 96-bit (12-byte) nonce is derived from the counter:
```
nonce[0..8] = counter as little-endian u64
nonce[8..12] = 0x00 00 00 00
```
The counter advances by one on every `seal` **and** every `open` (even on a failed `open`),
so paired `seal`/`open` calls stay aligned without transmitting the nonce. The nonce is never
reused within a session: wrapping the 64-bit counter is unreachable in practice, and an
overflow panics rather than reusing a nonce. The key is zeroized on drop.
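The nonce construction above amounts to the following few lines. `nonce_for` is a hypothetical helper name; the layout (little-endian counter in the first 8 bytes, 4 zero bytes after) is the one stated in this section.

```rust
/// Build the 12-byte ChaCha20-Poly1305 nonce for a given message counter:
/// bytes 0..8 hold the counter in little-endian, bytes 8..12 stay zero.
fn nonce_for(counter: u64) -> [u8; 12] {
    let mut nonce = [0u8; 12];
    nonce[..8].copy_from_slice(&counter.to_le_bytes());
    nonce
}
```

With this layout the first application Data record, sealed at counter 2, uses a nonce whose first byte is `0x02` and whose remaining bytes are zero.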
---
## Data records and replay protection
After the handshake, application `Frame`s are exchanged as `Data` records
(`crates/aura-proto/src/session.rs`). Each `Data` record's **payload** is:
```
seq (u64, big-endian) || ChaCha20Poly1305_seal( frame_bytes, aad = header || seq )
```
- `seq` is the 8-byte big-endian record counter. On the happy path it equals the sealing
AEAD's counter (and the receiver's expected AEAD counter).
- The AEAD **AAD** is the 5-byte frame `header` concatenated with the 8-byte `seq`, so the
record is cryptographically bound to both its declared length/type and its claimed position.
- The ciphertext includes the 16-byte Poly1305 tag.
So the full record on the wire is:
```
[ header(5) ][ seq(8) ][ ciphertext + tag ]
\_____________________________________________/
header.length = 8 + len(ciphertext+tag)
```
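To make the framing concrete, here is a sketch of how the header, AAD, and payload split relate. The AEAD call itself is omitted, and `DATA_TYPE`, `SEQ_LEN`, `TAG_LEN`, and the three function names are hypothetical; only the byte layout comes from this section.

```rust
const DATA_TYPE: u8 = 0x06;
const VERSION: u8 = 0x01;
const SEQ_LEN: usize = 8;
const TAG_LEN: usize = 16;

/// Header for a Data record carrying `plaintext_len` bytes of frame data.
/// The ciphertext length is known before sealing (plaintext + tag), so the
/// header can be built first and then fed into the AAD.
fn data_header(plaintext_len: usize) -> [u8; 5] {
    let total = SEQ_LEN + plaintext_len + TAG_LEN;
    debug_assert!(total <= 0x00FF_FFFF, "caller must enforce the u24 limit");
    let len = (total as u32).to_be_bytes();
    [DATA_TYPE, len[1], len[2], len[3], VERSION]
}

/// AAD = header(5) || seq(8, big-endian).
fn record_aad(header: [u8; 5], seq: u64) -> [u8; 13] {
    let mut aad = [0u8; 13];
    aad[..5].copy_from_slice(&header);
    aad[5..].copy_from_slice(&seq.to_be_bytes());
    aad
}

/// Split a received Data payload into (seq, ciphertext + tag).
fn split_record_payload(payload: &[u8]) -> Option<(u64, &[u8])> {
    if payload.len() < SEQ_LEN + TAG_LEN {
        return None; // too short to hold seq and a Poly1305 tag
    }
    let (seq_bytes, ct) = payload.split_at(SEQ_LEN);
    let mut b = [0u8; 8];
    b.copy_from_slice(seq_bytes);
    Some((u64::from_be_bytes(b), ct))
}
```

Note the differing byte orders: `seq` is big-endian on the wire, while the AEAD nonce encodes its counter little-endian.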
### Sliding replay window
The receiver runs a **64-wide sliding-window** replay check (`REPLAY_WINDOW = 64`) *before*
touching the AEAD, so a duplicate or too-old record is rejected with `Replay(seq)` without
disturbing the AEAD counter (the session stays usable). The window:
- tracks the highest accepted `seq` plus a 64-bit bitmap of accepted positions below it;
- accepts a `seq` iff it is strictly newer than everything seen, or falls within the window
and has not been seen before;
- rejects a `seq` that equals the current highest, is already marked in the bitmap, or is
more than `REPLAY_WINDOW` below the highest.
The window is seeded at the post-handshake counter (`start = 2`): everything strictly below
`start` is treated as already-consumed, so the first legitimate Data record (`seq == 2`) is
accepted as "newer".
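The window logic described above can be sketched as follows. This is one possible reading of the rules, not the `session.rs` implementation: the struct layout, the `accept` and `run` names, and the exact boundary (a `seq` exactly `REPLAY_WINDOW` below the highest is still accepted) are assumptions. For brevity the sketch marks a `seq` as seen immediately; a real receiver might commit only after the AEAD open succeeds.

```rust
const REPLAY_WINDOW: u64 = 64;

struct ReplayWindow {
    start: u64,           // everything below this is treated as consumed
    highest: Option<u64>, // highest accepted seq, None before the first record
    bitmap: u64,          // bit i set => seq (highest - 1 - i) was accepted
}

impl ReplayWindow {
    fn new(start: u64) -> Self {
        ReplayWindow { start, highest: None, bitmap: 0 }
    }

    /// Returns true and records `seq` if it is fresh; false for a replay.
    fn accept(&mut self, seq: u64) -> bool {
        if seq < self.start {
            return false;
        }
        match self.highest {
            None => {
                self.highest = Some(seq);
                true
            }
            Some(h) if seq > h => {
                let shift = seq - h;
                self.bitmap = if shift > REPLAY_WINDOW {
                    0 // old state fell entirely out of the window
                } else {
                    // Slide old entries down, then mark the previous highest.
                    let slid = if shift == 64 { 0 } else { self.bitmap << shift };
                    slid | (1u64 << (shift - 1))
                };
                self.highest = Some(seq);
                true
            }
            Some(h) => {
                let d = h - seq;
                if d == 0 || d > REPLAY_WINDOW {
                    return false; // duplicate of highest, or too old
                }
                let bit = 1u64 << (d - 1);
                if self.bitmap & bit != 0 {
                    return false; // already seen
                }
                self.bitmap |= bit;
                true
            }
        }
    }
}

/// Feed a list of seq values through a fresh window and collect the verdicts.
fn run(start: u64, seqs: &[u64]) -> Vec<bool> {
    let mut w = ReplayWindow::new(start);
    seqs.iter().map(|&s| w.accept(s)).collect()
}
```

Calling `run(2, &[1, 2, 2, 4, 3, 3])` exercises the seeding rule, a duplicate of the highest, and an out-of-order arrival inside the window.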
### Full-duplex split
A `Session` can be `split()` into independent `SessionSender` (writer + outbound AEAD +
send counter) and `SessionReceiver` (reader + inbound AEAD + replay window) halves, which can
be driven from separate tasks for a concurrent read/write data path (e.g. the VPN tunnel).
`recv_frame` is **not** cancellation-safe and must be driven from a single owning task.
---
## Mimicry layer
The outer QUIC/TLS layer (`crates/aura-transport/`) exists purely to disguise the connection
as browser HTTP/3 traffic. It is explicitly **not** the authentication boundary.
- **ALPN** advertises `h3` and `h3-29` (`ALPN_H3`) — exactly what Chrome offers for HTTP/3 —
so the ALPN extension is indistinguishable from a real browser's.
- **Transport params** mirror a Chromium HTTP/3 connection: ~30 s idle timeout, ~15 s
keep-alive, 100 concurrent bidi/uni streams, ~10 MB flow-control receive windows
(`chrome_quic_transport_config`).
- **SNI** defaults to a generic CDN-looking hostname (`cdn.example.com`) when the caller does
not supply one; deployments pass their own camouflage hostname.
- The QUIC **client accepts any server certificate** (`AcceptAnyServerCert` — all verifier
methods return success). This is safe *only* because the outer TLS is not authentication:
the real mutual auth is the inner Aura handshake. The server's outer TLS likewise disables
client auth (`with_no_client_auth`).
> Do not reuse `AcceptAnyServerCert` anywhere the TLS layer *is* the authentication boundary.
---
## Error model
The protocol layer surfaces `ProtoError` (`crates/aura-proto/src/lib.rs`), including:
`Io`, `Crypto`, `Pki`, `UnknownMsgType`, `BadVersion`, `FrameTooLarge`, `UnexpectedMsg`,
`MalformedHandshake`, `MalformedFrame`, `Signature`, `FinishedMismatch`, `Replay`, and
`Alert`. A peer may send a fatal `Alert` frame (type `0xFF`); the first payload byte is the
alert code, surfaced to the local side as `ProtoError::Alert(code)`.