fix(kanban): recover and self-heal stuck agent runs

A run could get stranded at 'running' in the UI after a crash, disconnect,
or restart, with no way to clear it. Root cause was a race: the SSE history
replay re-asserted a stale `running` status that overrode the poll's settled
status, so the run showed "Running" and the settle error at the same time.

Server (runs.ts / runner.ts / index.ts):
- reconcile() on every read force-settles any 'running' run with no live
  runner, so the board self-heals on the next poll (≤3s); no restart needed.
- forceSettle() emits a persisted `status` event so an open/reconnecting
  SSE stream replays the terminal state last, not a stale `running`.
- Startup orphan-reconciliation now also emits that event (was the gap that
  let the replay re-assert `running` after a server restart).
- Idle watchdog (10min): a silent pi is settled as 'failed' instead of
  hanging forever; SIGKILL escalation (20s) reaps wedged processes.
- stop() now recovers: active→abort, orphaned-but-running→force-stop
  (the Stop button clears wedged runs instead of 409'ing).
- start() catch force-settles 'failed' so a spawn failure never orphans a
  half-created 'running' row.
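
A minimal sketch of the reconcile-on-read and forceSettle() behaviour
described above, not the actual code in runs.ts / runner.ts. The type
names, the in-memory `runs`, `liveRunners`, and `eventLog` stand-ins, the
`listRuns()` helper, the terminal status names other than 'failed', and
the choice to settle orphans as 'failed' are all assumptions made for
illustration.

```typescript
// Hypothetical shapes; the real types in runs.ts / runner.ts may differ.
type RunStatus = 'running' | 'completed' | 'failed' | 'stopped';

interface Run {
  id: string;
  status: RunStatus;
  error?: string;
}

interface StatusEvent {
  type: 'status';
  runId: string;
  status: RunStatus;
}

// In-memory stand-ins for the persisted run rows, the set of run ids that
// still have a live runner, and the persisted event log replayed over SSE.
const runs = new Map<string, Run>();
const liveRunners = new Set<string>();
const eventLog: StatusEvent[] = [];

// Settle a run and append a persisted `status` event, so that a replay of
// the log ends on the terminal state rather than on a stale `running`.
function forceSettle(
  run: Run,
  status: Exclude<RunStatus, 'running'>,
  error?: string,
): void {
  run.status = status;
  if (error !== undefined) run.error = error;
  eventLog.push({ type: 'status', runId: run.id, status });
}

// Force-settle every run that claims to be running but has no live runner.
// Settling as 'failed' is an assumption of this sketch.
function reconcile(): void {
  runs.forEach((run) => {
    if (run.status === 'running' && !liveRunners.has(run.id)) {
      forceSettle(run, 'failed', 'no live runner for this run');
    }
  });
}

// Every read reconciles first, so a polling client sees the healed state.
function listRuns(): Run[] {
  reconcile();
  return Array.from(runs.values());
}
```

Because the event is appended at settle time, it is the last entry for
that run in the log, which is what lets a reconnecting stream finish its
replay on the terminal status.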

Client (useOrchestrator.ts):
- patchRun refuses to un-settle a terminal run, dropping stale replayed
  status as a belt-and-suspenders guard against any such race.
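
The client-side guard could look roughly like the following. This is a
sketch under assumptions, not the patchRun in useOrchestrator.ts: the
`ClientRun` shape, the `TERMINAL` set, the status names other than
'running' and 'failed', and the pure-function signature are hypothetical.

```typescript
// Hypothetical client-side shapes; the real hook state may differ.
type ClientRunStatus = 'running' | 'completed' | 'failed' | 'stopped';

interface ClientRun {
  id: string;
  status: ClientRunStatus;
  error?: string;
}

// Statuses treated as settled in this sketch.
const TERMINAL: ReadonlySet<ClientRunStatus> = new Set<ClientRunStatus>([
  'completed',
  'failed',
  'stopped',
]);

// Merge a patch into a run, but never let a replayed `running` status
// un-settle a run that is already terminal. Other fields still apply.
function patchRun(current: ClientRun, patch: Partial<ClientRun>): ClientRun {
  const wouldUnsettle =
    TERMINAL.has(current.status) && patch.status === 'running';
  if (wouldUnsettle) {
    const rest: Partial<ClientRun> = { ...patch };
    delete rest.status; // drop only the stale status
    return { ...current, ...rest };
  }
  return { ...current, ...patch };
}
```

With this shape, a stale `running` replayed after the poll has reported
'failed' leaves the status untouched, while a normal running-to-terminal
transition passes through unchanged.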


francy51

2026-06-17 18:53:44 -04:00

parent 6531dc00df

commit 408bdb6dd7

5 changed files with 152 additions and 15 deletions

									

apps/api/src/routes/orchestrator.ts
									
												
@@ -87,7 +87,7 @@ orchestrator.post('/runs/:id/message', async (c) => {
   }
 });
 
-/** Stop an active run. */
+/** Stop an active or wedged run (force-settles a run with no live process). */
 orchestrator.post('/runs/:id/stop', (c) => {
   const id = c.req.param('id');
   try {
