feat(transport,cli,tunnel): v3.4 port auto-detect + bug fixes from live test

Live macOS test against the production server uncovered six bugs (one of
which turned out to be a port collision with sing-box, not a real bug).
This commit addresses all of them except #45, which is deferred (see the
end of this message), and adds v3.4 port discovery so the same collision
is handled transparently next time.

## v3.4 server port-discovery

- Defaults moved off 443/444 to 8443/8443/8444 (TransportSection::default,
  ServerInitOpts, ProvisionClientOpts, CLI flags). 443 is heavily contested
  in practice (sing-box, Hysteria2, reverse proxies) and the previous
  default silently lost the bind when a co-tenant was already there.
- MultiServer::bind_with_outer_or_scan: scans forward up to
  DEFAULT_PORT_SCAN_MAX (20) candidates per transport when the requested
  port is occupied; QUIC keeps walking if it lands on the custom-UDP port.
- MultiServer::bound_addrs(): the actual addresses each transport bound to.
- Server logs the bound addresses and writes a runtime snapshot
  (server.toml.runtime.json) when they differ from the requested ones, so
  `aura sign-bridges` can re-sign the bridges manifest later.
- BridgeManifest gains an optional `endpoints: Vec<BridgeEndpoint>` field
  with per-transport ports. Backward-compatible: old v3.3 clients ignore
  the field and continue to use the v1 `bridges` line.
- `aura sign-bridges --endpoints HOST:tcp=N:quic=N:udp=N` to mint v3.4
  manifests; bridges line is auto-synthesised for v3.3 clients.

## Bug fixes from the live test

- macOS TUN naming (#41): the tun crate rejects names that don't match
  ^utun[0-9]+$. On macOS we now substitute `""` (kernel auto-assigns
  utunN), capture the assigned name via inner.tun_name(), and propagate it
  through to os_routes::OsRouteGuard::install, so
  `route add -interface utunN` uses the real interface, not "aura0".
- Packet counters (#42): Stats { tx_packets, rx_packets } are now actually
  bumped by the data path. `aura status` shows live numbers instead of
  permanent zeros.
- render_client_toml schema (#44): provisioner emits proper
  `[[tunnel.split.vpn]] cidr = "..."` / `[[tunnel.split.direct]]` blocks
  from new --vpn-cidrs / --direct-cidrs flags. The v3.3 `vpn_cidrs = [...]`
  flat array was silently ignored by serde, leaving users with `rules: 0`
  even when their CIDRs looked right.
- #43 / #46 (TCP/443 dial early-eof / no payload back): diagnosed as the
  sing-box port collision, not an Aura bug. The v3.4 port-scan path makes
  it go away: the server picks a free port and clients learn it from the
  manifest.

## Test coverage

Three new unit tests for the port-scanner (UDP busy, TCP busy, zero
budget); two new tests for v3.4 BridgeManifest round-trip with endpoints;
one integration test for the new `[[tunnel.split.vpn]]` rendering; tests
for the runtime-state file write/read round-trip; agent-added
router-counter tests in aura-tunnel/tests/routes.rs.

cargo test --workspace, cargo clippy --workspace -- -D warnings, and
cargo fmt --check all pass.

#45 (silent client exit when underlying QUIC transport breaks) is still
outstanding. It needs deeper investigation and is deferred to a follow-up.

🤖 Generated with [Claude Code](https://claude.com/claude-code)

Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>
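
The `--endpoints HOST:tcp=N:quic=N:udp=N` syntax from the commit message could be parsed roughly as follows. This is only a sketch: the `BridgeEndpoint` field names, the `parse_endpoint_spec` helper, and the error strings are assumptions made for illustration and are not taken from the Aura source. It also assumes the host contains no `:` (so bare IPv6 literals are not handled) and that each transport key is optional.

```rust
/// Hypothetical shape of one manifest endpoint: a host plus optional per-transport ports.
#[derive(Debug, PartialEq, Eq)]
pub struct BridgeEndpoint {
    pub host: String,
    pub tcp: Option<u16>,
    pub quic: Option<u16>,
    pub udp: Option<u16>,
}

/// Parse `HOST:tcp=N:quic=N:udp=N`. Keys may appear in any order and may be omitted.
pub fn parse_endpoint_spec(spec: &str) -> Result<BridgeEndpoint, String> {
    let mut parts = spec.split(':');
    // The first segment is the host; an empty host is rejected.
    let host = parts.next().filter(|h| !h.is_empty()).ok_or("missing host")?;
    let mut ep = BridgeEndpoint { host: host.to_string(), tcp: None, quic: None, udp: None };
    for part in parts {
        let (key, value) = part
            .split_once('=')
            .ok_or_else(|| format!("expected key=port, got `{part}`"))?;
        // Parsing into u16 rejects anything outside 0..=65535.
        let port: u16 = value.parse().map_err(|_| format!("invalid port `{value}`"))?;
        match key {
            "tcp" => ep.tcp = Some(port),
            "quic" => ep.quic = Some(port),
            "udp" => ep.udp = Some(port),
            other => return Err(format!("unknown transport `{other}`")),
        }
    }
    Ok(ep)
}
```

Keeping every port optional mirrors the commit's note that disabled transports stay `None`; whether the real CLI requires all three keys is not stated in the source.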
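
The macOS TUN naming fix (#41) depends on the `^utun[0-9]+$` rule quoted in the commit message. A minimal sketch of that check is below. Both helper names are hypothetical. The fallback policy, which requests an empty name only when the configured one is not a valid `utunN`, is an assumption, and the actual change may always pass `""` on macOS. The `is_macos` flag stands in for a `cfg!(target_os = "macos")` check so the logic can be exercised on any platform.

```rust
/// Returns true when `name` has the form `utun<digits>`, i.e. matches `^utun[0-9]+$`.
pub fn is_valid_utun_name(name: &str) -> bool {
    match name.strip_prefix("utun") {
        // At least one digit must follow the prefix, and nothing but digits.
        Some(rest) => !rest.is_empty() && rest.bytes().all(|b| b.is_ascii_digit()),
        None => false,
    }
}

/// Hypothetical helper: pick the interface name to request from the TUN layer.
/// On macOS an invalid name is replaced by "" so the kernel assigns `utunN` itself.
pub fn requested_tun_name(configured: &str, is_macos: bool) -> &str {
    if is_macos && !is_valid_utun_name(configured) {
        ""
    } else {
        configured
    }
}
```

Whatever name is requested, the caller still has to read back the name the kernel actually assigned before installing routes, which is the part of the fix the commit message describes.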
@@ -193,8 +193,16 @@ pub struct MultiServer {
     /// Live TCP server handle (shared with the accept loop), used by the mask rotator to update
     /// the accept-time options. `None` when the TCP transport was not enabled.
     tcp: Option<Arc<TcpServer>>,
+    /// v3.4: actual bound addresses for each transport. Differs from the originally requested
+    /// `Endpoints` when [`Self::bind_with_outer_or_scan`] had to walk past a busy port. Empty
+    /// (`None`) for transports that were disabled or failed to bind.
+    bound: Endpoints,
 }
 
+/// v3.4: default port-scan budget. When a transport's requested port is occupied,
+/// [`MultiServer::bind_with_outer_or_scan`] walks forward this many candidates before giving up.
+pub const DEFAULT_PORT_SCAN_MAX: u16 = 20;
+
 impl MultiServer {
     /// Bind and start accept loops for every transport whose address is set in `endpoints`.
     /// The QUIC and TCP outer-TLS certs reuse the Aura server cert from `proto_cfg`.
@@ -251,10 +259,12 @@ impl MultiServer {
 
         let (txc, rx) = mpsc::channel::<Accepted>(32);
         let mut tasks = Vec::new();
+        let mut bound = Endpoints::default();
 
         let udp_handle = if let Some(addr) = endpoints.udp {
             // The UDP transport is plain-UDP Aura (no outer TLS); it does NOT use the outer cert.
             let server = Arc::new(UdpServer::bind(addr, proto_cfg.clone(), udp)?);
+            bound.udp = server.local_addr().ok();
             tasks.push(tokio::spawn(udp_accept_loop(
                 Arc::clone(&server),
                 txc.clone(),
@@ -271,6 +281,7 @@ impl MultiServer {
                 }
                 None => TcpServer::bind(addr, proto_cfg.clone(), tcp.clone()).await?,
             });
+            bound.tcp = server.local_addr().ok();
             tasks.push(tokio::spawn(tcp_accept_loop(
                 Arc::clone(&server),
                 txc.clone(),
@@ -289,6 +300,7 @@ impl MultiServer {
                 ),
             };
             let server = AuraServer::bind(addr, oc, ok, proto_cfg.clone())?;
+            bound.quic = server.local_addr().ok();
             tasks.push(tokio::spawn(quic_accept_loop(server, txc.clone())));
         }
 
@@ -300,9 +312,119 @@ impl MultiServer {
             tasks,
             udp: udp_handle,
             tcp: tcp_handle,
+            bound,
         })
     }
 
+    /// v3.4: like [`Self::bind_with_outer`], but if any transport's requested port is occupied
+    /// (returns `io::ErrorKind::AddrInUse`), scan forward up to `max_scan` candidates per
+    /// transport before failing. The actually-bound addresses are recorded in [`Self::bound_addrs`];
+    /// they often differ from `endpoints` when the host has e.g. sing-box on the original port.
+    ///
+    /// The UDP transport and QUIC must end up on different ports (both use UDP); if the scan
+    /// drives them into a collision, the second one keeps walking. TCP can share a port number
+    /// with either since it is a different protocol.
+    ///
+    /// Per-transport policy:
+    /// * **Fatal bind error** (anything other than `AddrInUse`, or `AddrInUse` past the scan
+    ///   budget) bubbles up and aborts the server, keeping behaviour consistent with v3.3.
+    /// * **No fallback for transports that were `None`**: they stay disabled.
+    ///
+    /// # Errors
+    /// Same as [`Self::bind_with_outer`] after the scan-resolved endpoints are computed.
+    pub async fn bind_with_outer_or_scan(
+        mut endpoints: Endpoints,
+        proto_cfg: ServerConfig,
+        udp: UdpOpts,
+        tcp: TcpOpts,
+        outer_cert_pem: Option<&str>,
+        outer_key_pem: Option<&str>,
+        max_scan: u16,
+    ) -> anyhow::Result<Self> {
+        // Pre-probe each transport's port. We use raw std::net binds (no socket options are set
+        // explicitly, so the std and OS defaults apply) to test availability, drop the probe,
+        // and pass the resolved port to the real bind. There is a short race window between the
+        // drop and the real bind; for a non-malicious environment that's acceptable, and the real
+        // bind will simply return AddrInUse if hit (caller can re-run the scan).
+        if let Some(addr) = endpoints.udp {
+            let resolved = scan_free_udp_port(addr, max_scan).ok_or_else(|| {
+                anyhow::anyhow!(
+                    "no free UDP port in {}..{} for Aura custom-UDP transport",
+                    addr.port(),
+                    addr.port().saturating_add(max_scan)
+                )
+            })?;
+            if resolved != addr {
+                tracing::warn!(
+                    requested = %addr,
+                    actual = %resolved,
+                    "UDP transport: requested port busy, scanned forward and picked a free one"
+                );
+            }
+            endpoints.udp = Some(resolved);
+        }
+        if let Some(addr) = endpoints.quic {
+            // QUIC must not collide with the custom-UDP port; if it does, start scanning from
+            // the next port.
+            let start = match endpoints.udp {
+                Some(udp_addr) if udp_addr.ip() == addr.ip() && udp_addr.port() == addr.port() => {
+                    SocketAddr::new(addr.ip(), addr.port().saturating_add(1))
+                }
+                _ => addr,
+            };
+            let resolved = scan_free_udp_port(start, max_scan).ok_or_else(|| {
+                anyhow::anyhow!(
+                    "no free UDP port in {}..{} for QUIC outer transport",
+                    start.port(),
+                    start.port().saturating_add(max_scan)
+                )
+            })?;
+            if resolved != addr {
+                tracing::warn!(
+                    requested = %addr,
+                    actual = %resolved,
+                    "QUIC transport: requested port busy, scanned forward and picked a free one"
+                );
+            }
+            endpoints.quic = Some(resolved);
+        }
+        if let Some(addr) = endpoints.tcp {
+            let resolved = scan_free_tcp_port(addr, max_scan).ok_or_else(|| {
+                anyhow::anyhow!(
+                    "no free TCP port in {}..{} for TCP outer transport",
+                    addr.port(),
+                    addr.port().saturating_add(max_scan)
+                )
+            })?;
+            if resolved != addr {
+                tracing::warn!(
+                    requested = %addr,
+                    actual = %resolved,
+                    "TCP transport: requested port busy, scanned forward and picked a free one"
+                );
+            }
+            endpoints.tcp = Some(resolved);
+        }
+
+        Self::bind_with_outer(
+            endpoints,
+            proto_cfg,
+            udp,
+            tcp,
+            outer_cert_pem,
+            outer_key_pem,
+        )
+        .await
+    }
+
+    /// v3.4: the addresses each enabled transport actually bound to. After
+    /// [`Self::bind_with_outer_or_scan`], these may differ from the requested `Endpoints` if a
+    /// port had to be walked past a conflict. Transports that were not enabled remain `None`.
+    #[must_use]
+    pub fn bound_addrs(&self) -> &Endpoints {
+        &self.bound
+    }
+
     /// Update the UDP accept-time options. The next [`Self::accept`] of a UDP connection will use
     /// the new options; existing connections keep theirs. No-op if the UDP transport is disabled.
     pub async fn set_udp_opts(&self, new_opts: UdpOpts) {
@@ -326,6 +448,42 @@ impl MultiServer {
     }
 }
 
+/// Try `start.port()`, `start.port()+1`, ..., `start.port()+max_scan` until a UDP bind succeeds.
+/// Returns the resolved [`SocketAddr`]; `None` if no candidate was free within the budget.
+fn scan_free_udp_port(start: SocketAddr, max_scan: u16) -> Option<SocketAddr> {
+    let mut port = start.port();
+    let upper = port.saturating_add(max_scan);
+    while port <= upper {
+        let cand = SocketAddr::new(start.ip(), port);
+        if std::net::UdpSocket::bind(cand).is_ok() {
+            return Some(cand);
+        }
+        // Overflow guard: stop at u16::MAX so that `port += 1` below cannot overflow.
+        if port == u16::MAX {
+            return None;
+        }
+        port += 1;
+    }
+    None
+}
+
+/// Try `start.port()`, `start.port()+1`, ..., `start.port()+max_scan` until a TCP bind succeeds.
+fn scan_free_tcp_port(start: SocketAddr, max_scan: u16) -> Option<SocketAddr> {
+    let mut port = start.port();
+    let upper = port.saturating_add(max_scan);
+    while port <= upper {
+        let cand = SocketAddr::new(start.ip(), port);
+        if std::net::TcpListener::bind(cand).is_ok() {
+            return Some(cand);
+        }
+        if port == u16::MAX {
+            return None;
+        }
+        port += 1;
+    }
+    None
+}
+
 impl Drop for MultiServer {
     fn drop(&mut self) {
         for t in &self.tasks {
@@ -399,3 +557,44 @@ async fn quic_accept_loop(server: AuraServer, tx: mpsc::Sender<Accepted>) {
         }
     }
 }
+
+#[cfg(test)]
+mod port_scan_tests {
+    use super::*;
+
+    /// When the requested port is occupied, the scan walks forward and returns a port within
+    /// the budget. We hold a real socket to simulate the busy condition.
+    #[test]
+    fn udp_scan_skips_busy_port() {
+        // Bind a blocker to an OS-assigned port, keep it open, and start scanning from that
+        // same port; the scanner must skip the busy port and find a free neighbour.
+        let blocker = std::net::UdpSocket::bind("127.0.0.1:0").expect("bind blocker");
+        let busy_addr = blocker.local_addr().expect("local_addr");
+        let resolved = scan_free_udp_port(busy_addr, 10).expect("scan must find a free port");
+        assert_ne!(resolved.port(), busy_addr.port(), "must skip the busy port");
+        assert!(resolved.port() > busy_addr.port());
+        assert!(resolved.port() <= busy_addr.port() + 10);
+        drop(blocker);
+    }
+
+    #[test]
+    fn tcp_scan_skips_busy_port() {
+        let blocker = std::net::TcpListener::bind("127.0.0.1:0").expect("bind blocker");
+        let busy_addr = blocker.local_addr().expect("local_addr");
+        let resolved = scan_free_tcp_port(busy_addr, 10).expect("scan must find a free port");
+        assert_ne!(resolved.port(), busy_addr.port());
+        assert!(resolved.port() > busy_addr.port());
+        assert!(resolved.port() <= busy_addr.port() + 10);
+        drop(blocker);
+    }
+
+    /// With a zero scan budget, a busy port yields `None` (no walk, no luck).
+    #[test]
+    fn scan_with_zero_budget_returns_none_on_busy_port() {
+        let blocker = std::net::UdpSocket::bind("127.0.0.1:0").expect("bind blocker");
+        let busy_addr = blocker.local_addr().expect("local_addr");
+        let resolved = scan_free_udp_port(busy_addr, 0);
+        assert_eq!(resolved, None);
+        drop(blocker);
+    }
+}
@@ -72,7 +72,9 @@ pub mod tcp;
 pub mod udp;
 
 pub use conn::AuraConnection;
-pub use dial::{dial, Accepted, DialConfig, Endpoints, MultiServer, TransportMode};
+pub use dial::{
+    dial, Accepted, DialConfig, Endpoints, MultiServer, TransportMode, DEFAULT_PORT_SCAN_MAX,
+};
 pub use mimicry::{alpn_protocols, chrome_quic_transport_config, ALPN_H3, DEFAULT_SNI};
 pub use padding::{
     inject_padding_frames, next_bucket_for_profile, pad_to_bucket, pad_to_https_size,
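
Because each probe socket is dropped before the real bind, a port picked for one UDP-based transport looks free again when the next transport is scanned. One way to keep the custom-UDP and QUIC transports apart for the whole walk is to pass the already-chosen address into the scan. The sketch below uses only `std::net`. The function name `scan_free_udp_port_excluding` and its `reserved` parameter are hypothetical and are not part of the diff above.

```rust
use std::net::{SocketAddr, UdpSocket};

/// Probe `start.port()..=start.port() + max_scan` (capped at u16::MAX) and return the first
/// address where a UDP bind succeeds, skipping `reserved` even if it currently looks free.
pub fn scan_free_udp_port_excluding(
    start: SocketAddr,
    max_scan: u16,
    reserved: Option<SocketAddr>,
) -> Option<SocketAddr> {
    let first = start.port();
    let last = first.saturating_add(max_scan);
    // An inclusive u16 range cannot overflow, so no manual guard is needed.
    (first..=last)
        .map(|port| SocketAddr::new(start.ip(), port))
        .filter(|cand| Some(*cand) != reserved)
        // The probe socket is dropped as soon as `is_ok()` has been evaluated.
        .find(|cand| UdpSocket::bind(cand).is_ok())
}
```

A caller would resolve the custom-UDP address first and then pass it as `reserved` when scanning for QUIC. The race between dropping the probe and the real bind, noted in the diff's own comment, applies here as well.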