feat(cli,aura-gui): v3.4.4 — graceful Shutdown via admin socket (#1)
Closes the long-standing "GUI Disconnect button doesn't actually kill aura"
bug. The previous kill path sent SIGTERM to sudo (our direct child) and
hoped sudo's signal forwarding would propagate to the aura child running
as root; in practice this is unreliable when the parent has no controlling
terminal (which Tauri-spawned children don't), so aura would survive the
"Disconnect" click with the TUN still up and the OS routes still installed.
## Implementation
Adds a `Shutdown` admin-socket request. The aura-cli main loops
(`client::run` and `server::run`) now `tokio::select!` between their normal
work (router.run() / accept loop) and a `tokio::sync::Notify` carried on
the shared `AdminState`. When an admin client posts `{"cmd":"shutdown"}`,
the handler calls `state.shutdown.notify_one()`, the second `select!` arm
fires, the work future is dropped, `OsRouteGuard::Drop` rolls back the
installed system routes, and the process exits with `Ok(())`: clean exit
code 0, the kernel reaps the TUN device, no orphan.
The whole round-trip is sub-500 ms in practice (the slow step is the
`route delete` invocations on macOS).
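
The ordering described above (shutdown signal, work abandoned, guard `Drop`, `Ok(())`) can be illustrated with a std-only sketch. This is not the aura-cli code: it swaps `tokio::select!` + `Notify` for a blocking `mpsc` channel so it has no dependencies, and `RouteGuard`, `run` and the log strings are hypothetical names used only for illustration.

```rust
use std::sync::mpsc::{Receiver, RecvTimeoutError};
use std::sync::{Arc, Mutex};
use std::time::Duration;

/// Hypothetical stand-in for `OsRouteGuard`: records its rollback when dropped.
struct RouteGuard {
    log: Arc<Mutex<Vec<&'static str>>>,
}

impl Drop for RouteGuard {
    fn drop(&mut self) {
        self.log.lock().unwrap().push("routes rolled back");
    }
}

/// Does "work" in 10 ms ticks until a shutdown message arrives, then returns `Ok(())`.
/// The guard is a local, so it is dropped on every return path.
fn run(shutdown: Receiver<()>, log: Arc<Mutex<Vec<&'static str>>>) -> Result<(), String> {
    let _guard = RouteGuard { log: Arc::clone(&log) };
    loop {
        match shutdown.recv_timeout(Duration::from_millis(10)) {
            Ok(()) => {
                log.lock().unwrap().push("shutdown received");
                return Ok(()); // `_guard` is dropped here, after the push above
            }
            Err(RecvTimeoutError::Timeout) => { /* one tick of normal work */ }
            Err(RecvTimeoutError::Disconnected) => {
                return Err("shutdown channel closed".into());
            }
        }
    }
}
```

Calling `run` on a worker thread and sending `()` on the channel should leave the log as "shutdown received" followed by "routes rolled back", which mirrors the point of the real change: cleanup happens because the function returns normally, not because a signal handler ran.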
## What changed
* `aura-cli/src/admin.rs`: `Request::Shutdown` variant, `AdminState.shutdown:
Arc<Notify>` field, handler that calls `notify_one()` + returns `Response::ok()`.
* `aura-cli/src/client.rs`: clones `admin_state.shutdown` before spawning the
admin server task, then `tokio::select!`s between `router.run()` and
`shutdown.notified()`. Whichever finishes first ends the function, and
`OsRouteGuard::Drop` runs afterwards.
* `aura-cli/src/server.rs`: same pattern around the `MultiServer::accept` loop.
On admin Shutdown the accept loop breaks, and `router_task` is aborted when
the function returns.
* `aura-cli/src/main.rs`: `aura shutdown --admin-socket <path>` subcommand for
CLI control (also useful from launchd/systemd post-stop hooks).
* `aura-gui/src-tauri/src/admin.rs`: new `send_shutdown(path)` helper; factored
out `round_trip()` for the common write-line + read-line pattern. Windows
stub returns "not implemented".
* `aura-gui/src-tauri/src/cli_proc.rs`: `ClientHandle::kill` now tries admin
Shutdown first (3 s poll for graceful exit), then SIGTERM to sudo (2 s),
then SIGKILL as last resort. The admin path needs no sudo because the
socket is already chmod 0666 from v3.4.1.
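
The three-stage escalation in `ClientHandle::kill` can be sketched as a small dependency-free function. This is an illustration under assumptions, not the GUI code: `escalate`, `wait_for_exit` and `Stage` are hypothetical names, and the closures stand in for `send_shutdown`, the `kill -TERM` call and `Child::try_wait`.

```rust
use std::thread;
use std::time::Duration;

/// Polls `exited` up to `attempts` times, sleeping `step` between polls.
/// Returns true as soon as the process is reported gone.
fn wait_for_exit(mut exited: impl FnMut() -> bool, attempts: u32, step: Duration) -> bool {
    for _ in 0..attempts {
        if exited() {
            return true;
        }
        thread::sleep(step);
    }
    false
}

/// Which stage finally ended the process.
#[derive(Debug, PartialEq)]
enum Stage {
    Admin,
    Sigterm,
    Sigkill,
}

/// Tries admin shutdown (30 polls), then SIGTERM (20 polls), then gives up to SIGKILL.
/// With `step` = 100 ms this matches the 3 s and 2 s grace periods described above.
fn escalate(
    mut send_admin_shutdown: impl FnMut() -> bool,
    mut send_sigterm: impl FnMut(),
    mut exited: impl FnMut() -> bool,
    step: Duration,
) -> Stage {
    // An admin ack alone is not enough; the process must actually exit in time.
    if send_admin_shutdown() && wait_for_exit(&mut exited, 30, step) {
        return Stage::Admin;
    }
    send_sigterm();
    if wait_for_exit(&mut exited, 20, step) {
        return Stage::Sigterm;
    }
    Stage::Sigkill // the caller would now call `Child::kill` and `Child::wait`
}
```

Passing the probes as closures keeps the ladder testable without spawning a real child; the real method holds the child lock and calls `try_wait` directly.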
## Test
New `admin::tests::shutdown_request_fires_notify` unit test: spawns a
notified() waiter, calls `handle_request(Request::Shutdown)`, asserts the
waiter wakes within 200 ms. Combined with the existing 5 admin tests, all 6
pass.
`cargo test --workspace` — all green, `cargo clippy --workspace --all-targets
-- -D warnings` — clean.
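
For readers without the repository at hand, the shape of that test can be approximated without tokio. The sketch below is an assumption-laden analogue, not `shutdown_request_fires_notify` itself: `Notifier` is a hypothetical std-only stand-in for `tokio::sync::Notify`, and `handle_shutdown` stands in for `handle_request(Request::Shutdown)`.

```rust
use std::sync::{Arc, Condvar, Mutex};
use std::thread;
use std::time::Duration;

/// Minimal one-shot notifier built on Mutex + Condvar.
#[derive(Default)]
struct Notifier {
    fired: Mutex<bool>,
    cv: Condvar,
}

impl Notifier {
    /// Sets the flag and wakes one waiter. The flag persists, so a waiter
    /// that arrives later still sees the notification.
    fn notify_one(&self) {
        *self.fired.lock().unwrap() = true;
        self.cv.notify_one();
    }

    /// Blocks until notified or until `timeout` elapses; returns whether it was notified.
    fn wait_timeout(&self, timeout: Duration) -> bool {
        let guard = self.fired.lock().unwrap();
        let (guard, _) = self
            .cv
            .wait_timeout_while(guard, timeout, |fired| !*fired)
            .unwrap();
        *guard
    }
}

/// Hypothetical handler: firing the notifier is the only side effect of a shutdown request.
fn handle_shutdown(notifier: &Notifier) -> &'static str {
    notifier.notify_one();
    "ok"
}

/// Spawns a waiter, sends the shutdown request, and reports whether the waiter woke in time.
fn waiter_wakes_within(limit: Duration) -> bool {
    let notifier = Arc::new(Notifier::default());
    let waiter = {
        let notifier = Arc::clone(&notifier);
        thread::spawn(move || notifier.wait_timeout(limit))
    };
    handle_shutdown(&notifier);
    waiter.join().unwrap()
}
```

With a 200 ms limit, as in the real test, `waiter_wakes_within(Duration::from_millis(200))` is expected to return true whether the waiter starts before or after the notification.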
🤖 Generated with [Claude Code](https://claude.com/claude-code)
Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>
@@ -30,15 +30,54 @@ pub struct StatusResponse {
 
 #[cfg(unix)]
 pub fn query_status(path: &str) -> Result<StatusResponse> {
+    let line = round_trip(path, b"{\"cmd\":\"status\"}\n", Duration::from_millis(1500))?;
+    let resp: StatusResponse = serde_json::from_str(&line)
+        .with_context(|| format!("parsing admin response: {line}"))?;
+    if !resp.ok {
+        return Err(anyhow!(
+            "admin returned error: {}",
+            resp.error
+                .clone()
+                .unwrap_or_else(|| "(no error string)".into())
+        ));
+    }
+    Ok(resp)
+}
+
+/// v3.4.4: send `{"cmd":"shutdown"}` over the admin socket. The running aura-cli sees the
+/// notification, breaks its router select! loop, and exits after `OsRouteGuard::Drop` rolls
+/// back the OS routes — no SIGTERM-through-sudo gymnastics needed (the admin socket is
+/// chmod 0666 so the GUI's desktop-user process can write to it directly).
+///
+/// Returns `Ok(())` on success; the caller is expected to wait briefly afterwards for the
+/// process to actually exit.
+#[cfg(unix)]
+pub fn send_shutdown(path: &str) -> Result<()> {
+    let line = round_trip(path, b"{\"cmd\":\"shutdown\"}\n", Duration::from_millis(1500))?;
+    // Reuse the StatusResponse shape — it has the `ok` / `error` fields we need, the rest are
+    // None for a shutdown reply.
+    let resp: StatusResponse = serde_json::from_str(&line)
+        .with_context(|| format!("parsing admin response: {line}"))?;
+    if !resp.ok {
+        return Err(anyhow!(
+            "shutdown rejected by admin: {}",
+            resp.error
+                .clone()
+                .unwrap_or_else(|| "(no error string)".into())
+        ));
+    }
+    Ok(())
+}
+
+#[cfg(unix)]
+fn round_trip(path: &str, request: &[u8], timeout: Duration) -> Result<String> {
     use std::os::unix::net::UnixStream;
     let mut sock =
         UnixStream::connect(path).with_context(|| format!("connecting to admin socket {path}"))?;
-    sock.set_read_timeout(Some(Duration::from_millis(1500)))?;
-    sock.set_write_timeout(Some(Duration::from_millis(1500)))?;
-    sock.write_all(b"{\"cmd\":\"status\"}\n")?;
+    sock.set_read_timeout(Some(timeout))?;
+    sock.set_write_timeout(Some(timeout))?;
+    sock.write_all(request)?;
     let mut buf = String::new();
     // The server writes one line + newline and closes the connection only when *we* close. We
     // need to read until newline. Use a small reader buffer.
     let mut tmp = [0u8; 1024];
     loop {
         let n = sock.read(&mut tmp)?;
@@ -53,18 +92,9 @@ pub fn query_status(path: &str) -> Result<StatusResponse> {
     let line = buf
         .lines()
         .next()
-        .ok_or_else(|| anyhow!("empty admin response"))?;
-    let resp: StatusResponse =
-        serde_json::from_str(line).with_context(|| format!("parsing admin response: {line}"))?;
-    if !resp.ok {
-        return Err(anyhow!(
-            "admin returned error: {}",
-            resp.error
-                .clone()
-                .unwrap_or_else(|| "(no error string)".into())
-        ));
-    }
-    Ok(resp)
+        .ok_or_else(|| anyhow!("empty admin response"))?
+        .to_string();
+    Ok(line)
 }
 
 #[cfg(windows)]
@@ -76,3 +106,10 @@ pub fn query_status(_path: &str) -> Result<StatusResponse> {
         "admin socket query is not yet implemented on Windows; GUI status is process-only"
     ))
 }
+
+#[cfg(windows)]
+pub fn send_shutdown(_path: &str) -> Result<()> {
+    Err(anyhow!(
+        "admin shutdown is not yet implemented on Windows; the GUI falls back to SIGTERM"
+    ))
+}
@@ -60,27 +60,55 @@ impl ClientHandle {
 
     /// Kill the child and reap it. Idempotent.
     ///
-    /// Because we spawned via `sudo -n aura …`, our direct child is `sudo` (running as us; we
-    /// own it). The real aura process is sudo's child, running as root, so we can't signal it
-    /// directly. SIGTERM to the sudo PID is forwarded to aura by sudo's signal handler, which
-    /// lets aura's `OsRouteGuard::Drop` and TUN cleanup run before exit. After a 2 s grace
-    /// period we fall back to SIGKILL via `Child::kill`, which kills sudo immediately (aura
-    /// becomes orphaned, but the kernel reaps it via PID 1 — TUN may linger).
+    /// v3.4.4 path — graceful via admin socket first. The aura admin socket is chmod 0666 (a
+    /// fix from earlier in v3.4.x), so the GUI's desktop-user process can write to it without
+    /// sudo. We send `{"cmd":"shutdown"}`, the aura main loop's `tokio::select!` fires its
+    /// shutdown arm, `OsRouteGuard::Drop` rolls back system routes, then process exits.
+    /// Typical exit is under 500 ms; we wait up to 3 s.
+    ///
+    /// Fall-back: if the admin send fails (socket missing, aura already wedged), drop to the
+    /// old SIGTERM-to-sudo path. Because we spawned via `sudo -n aura …`, our direct child is
+    /// `sudo` running as us, and sudo forwards SIGTERM to the aura child by its own signal
+    /// handler. SIGKILL via `Child::kill` is the absolute last resort — it leaves aura
+    /// orphaned with the TUN still up.
     pub fn kill(self) -> Result<()> {
         let pid = { self.child.lock().id() };
-        // SIGTERM to sudo — sudo forwards to aura. We own sudo so plain `kill` works.
+        let sock = self.admin_socket.clone();
+
+        // 1. Try the admin-socket shutdown. Quiet on failure — we'll fall through.
+        match crate::admin::send_shutdown(&sock) {
+            Ok(()) => {
+                // Poll for up to 3 s. Most exits land in well under 500 ms (the time
+                // OsRouteGuard::Drop spends running `route delete …`).
+                let mut guard = self.child.lock();
+                for _ in 0..30 {
+                    if matches!(guard.try_wait(), Ok(Some(_))) {
+                        return Ok(());
+                    }
+                    thread::sleep(Duration::from_millis(100));
+                }
+                // Admin acked but the process is still alive — fall through to SIGTERM.
+            }
+            Err(_) => {
+                // No admin response. Could be a stale socket from a previous, already-dead
+                // session. Fall through.
+            }
+        }
+
+        // 2. SIGTERM to sudo, sudo forwards to aura.
         let _ = Command::new("kill")
             .arg("-TERM")
             .arg(pid.to_string())
             .output();
         let mut guard = self.child.lock();
         for _ in 0..20 {
-            match guard.try_wait() {
-                Ok(Some(_)) => return Ok(()),
-                _ => thread::sleep(Duration::from_millis(100)),
+            if matches!(guard.try_wait(), Ok(Some(_))) {
+                return Ok(());
             }
+            thread::sleep(Duration::from_millis(100));
         }
-        // Grace period elapsed — fall back to SIGKILL.
+
+        // 3. SIGKILL — absolute last resort. Leaves aura orphaned but unblocks the UI.
         let _ = guard.kill();
         let _ = guard.wait();
         Ok(())