Autopilot — Supervised Autonomy
Agenties can run a goal's issues in sequence without waiting for you to give each next command. This is supervised autonomy, not full automation. The orchestrator picks the next issue, spawns a builder agent, validates the result, and continues, but it stops at defined checkpoints so you remain in control.
Modes
| Mode | Behavior |
|---|---|
| supervised-step | Runs exactly one issue, then pauses with a checkpoint. Resume when you are ready for the next one. This is the only mode available from the Goals panel today. |
| continuous | Runs issues back-to-back until a stop condition is hit (budget, blocker, or 3 consecutive failures). When evidence is missing the issue is marked blocked and the run continues; it does not interactively pause for confirmation. Not yet exposed in the Goals panel UI; available via the IPC API only. |
Start a run from the Goals panel: select a goal that has open issues, then press Run Autopilot. The run always starts in supervised-step mode — one issue, then a checkpoint.
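The difference between the two modes can be sketched as a small loop. This is a minimal illustration, not the orchestrator's actual code: `run_autopilot` and the `run_one_issue` callback are hypothetical names, and the callback is assumed to return a stop reason string when the run must end, or `None` when the next issue may start.

```python
from typing import Callable, Optional


def run_autopilot(mode: str, run_one_issue: Callable[[], Optional[str]]) -> str:
    """Drive iterations until a stop reason is produced.

    run_one_issue is a hypothetical callback: it returns a stop reason
    (for example "budget-exhausted") or None to allow another iteration.
    """
    while True:
        reason = run_one_issue()
        if reason is not None:
            return reason
        if mode == "supervised-step":
            # One issue, then a checkpoint, regardless of the outcome.
            return "checkpoint-required"
        # mode == "continuous": fall through and pick up the next issue.
```

In this sketch a supervised-step run always ends after one issue, while a continuous run ends only when an iteration reports a stop condition.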
The observe → decide → execute → verify → persist loop
Each iteration of the autopilot loop follows these steps:
| Step | What happens |
|---|---|
| Observe | Read the execution plan and current spawn budget. If budget pressure is high, stop immediately. |
| Decide | Select the next open issue for this goal. Apply priority order (critical → high → medium → low), then sequential number. Skip already-attempted issues and issues with unsatisfied dependencies. |
| Execute | Write a spawn entry for the builder agent. The agent receives the issue title, description, and extracted acceptance criteria. |
| Verify | Wait for a task-done mailbox message. Check the agent summary for observable evidence (test passes, commit hash, explicit output). If evidence is missing or the agent reports failure, the issue is marked blocked. |
| Persist | Write a journal entry. Failure to write the journal pauses the run immediately — journal integrity is required before advancing. |
| Continue | Update progress counters, check budget limits, and start the next iteration. |
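The selection rule in the Decide step can be sketched as a filter followed by a sort. The `Issue` shape and the `select_next_issue` name are assumptions made for illustration, and the sketch assumes a dependency counts as satisfied only when the issue it points to is done.

```python
from dataclasses import dataclass, field
from typing import Optional

# Rank values follow the priority order given in the Decide step.
PRIORITY_RANK = {"critical": 0, "high": 1, "medium": 2, "low": 3}


@dataclass
class Issue:
    # Hypothetical record shape; the real issue model may differ.
    seq: int
    priority: str
    status: str = "open"
    depends_on: list[int] = field(default_factory=list)


def select_next_issue(issues: list[Issue], attempted: set[int]) -> Optional[Issue]:
    """Return the next actionable issue, or None if nothing qualifies."""
    done = {i.seq for i in issues if i.status == "done"}
    candidates = [
        i for i in issues
        if i.status == "open"
        and i.seq not in attempted                      # skip already-attempted
        and all(dep in done for dep in i.depends_on)    # skip unmet dependencies
    ]
    # Priority first, then sequential number.
    candidates.sort(key=lambda i: (PRIORITY_RANK[i.priority], i.seq))
    return candidates[0] if candidates else None
```

A `None` result corresponds to the situation in which no actionable issue remains for the run.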
Guardrails
The following guardrails are enforced on every run. They cannot be disabled.
Concurrent run guard
If another autopilot run for the same goal is already active (status running or verifying), the new run is blocked immediately. This prevents two agents from editing the same files simultaneously.
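As a rough sketch, the guard amounts to a lookup over known runs. The dictionary keys `goal_id` and `status` and the function name are assumptions; only the two active status values come from the description above.

```python
# Statuses that count as "active" according to the guard description.
ACTIVE_STATUSES = {"running", "verifying"}


def can_start_run(goal_id: str, runs: list[dict]) -> bool:
    """Return False when the goal already has an active autopilot run.

    Each run is assumed to be a dict with "goal_id" and "status" keys.
    """
    return not any(
        run["goal_id"] == goal_id and run["status"] in ACTIVE_STATUSES
        for run in runs
    )
```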
Scope declaration
Before a builder is spawned, the task must have all required fields: issue ID, sequence number, title, declared files, and extracted acceptance criteria. A spawn without a valid scope declaration stops the run rather than proceeding blind.
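A validation step of this kind might look like the sketch below. The `ScopeDeclaration` class and its field names are illustrative, not the real schema. The sketch treats only `None` and blank strings as missing, so an empty file list still counts as declared; that choice is an assumption.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class ScopeDeclaration:
    # Hypothetical field names mirroring the required fields listed above.
    issue_id: Optional[str]
    sequence_number: Optional[int]
    title: Optional[str]
    declared_files: Optional[list[str]]
    acceptance_criteria: Optional[list[str]]


def missing_fields(scope: ScopeDeclaration) -> list[str]:
    """Return the names of required fields that are absent or blank."""
    def is_missing(value) -> bool:
        if value is None:
            return True
        if isinstance(value, str):
            return not value.strip()
        return False  # empty lists are accepted in this sketch

    return [name for name, value in vars(scope).items() if is_missing(value)]
```

A non-empty result would stop the run before any builder is spawned.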
Evidence-required close
An issue is only marked done when the agent summary contains at least one observable signal, such as a passing test, a commit hash, or explicit output.
If none of these patterns are present in the summary, the issue is marked blocked instead of done, and the consecutive-failure counter increments.
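The check can be pictured as a pattern scan over the summary text. The regular expressions below are invented for illustration, since the exact patterns are not listed here, and resetting the counter to zero on success is an assumption implied by the word "consecutive".

```python
import re

# Hypothetical patterns, one per signal kind (test pass, commit hash,
# explicit output). The real expressions are not documented here.
EVIDENCE_PATTERNS = [
    re.compile(r"\b\d+\s+(?:tests?\s+)?passed\b", re.IGNORECASE),
    re.compile(r"\bcommit\s+[0-9a-f]{7,40}\b", re.IGNORECASE),
    re.compile(r"^output:", re.IGNORECASE | re.MULTILINE),
]


def has_evidence(summary: str) -> bool:
    """True when at least one evidence pattern matches the summary."""
    return any(p.search(summary) for p in EVIDENCE_PATTERNS)


def close_issue(summary: str, consecutive_failures: int) -> tuple[str, int]:
    """Return the new issue status and the updated failure counter."""
    if has_evidence(summary):
        return "done", 0  # assumed: a success resets the streak
    return "blocked", consecutive_failures + 1
```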
Journal mandatory
The continuity journal must be updated after every iteration. If the write fails, the run stops with reason checkpoint-required so you can investigate the disk state before continuing.
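One way to picture this rule is a persist function that turns a write failure into a stop reason. The JSON Lines file format and the `persist_iteration` name are assumptions; only the `checkpoint-required` reason comes from the text above.

```python
import json
import os
from typing import Optional


def persist_iteration(journal_path: str, entry: dict) -> Optional[str]:
    """Append one journal entry; return a stop reason if the write fails."""
    try:
        with open(journal_path, "a", encoding="utf-8") as journal:
            journal.write(json.dumps(entry) + "\n")
            journal.flush()
            os.fsync(journal.fileno())  # make sure the entry reaches disk
    except OSError:
        # The run must not advance without a durable journal entry.
        return "checkpoint-required"
    return None
```

The caller would end the run whenever the function returns a reason instead of `None`.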
Drift detection
If the agent summary mentions CL-NNN references that do not include the expected issue sequence number, the autopilot treats the result as a drift event and marks the issue blocked. This catches agents that completed a different issue from the one they were assigned.
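A minimal sketch of this check, assuming references look like `CL-` followed by digits and that a summary with no references at all is not treated as drift (it would be caught by the evidence check instead):

```python
import re

# Matches references such as CL-012; the exact format is an assumption.
CL_REFERENCE = re.compile(r"\bCL-(\d+)\b")


def is_drift(summary: str, expected_seq: int) -> bool:
    """True when the summary cites CL references but none is the expected one."""
    cited = {int(number) for number in CL_REFERENCE.findall(summary)}
    return bool(cited) and expected_seq not in cited
```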
Scope containment
The loop implements a 50% out-of-scope threshold: if the majority of files edited by an agent fall outside the declared scope, the issue is marked blocked. In the current implementation the declared file list is always empty (acceptance criteria text is extracted, but file paths are not parsed out of it), so this guardrail is not enforced in practice. It is wired and unit-tested for when the feature is completed.
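The threshold logic can be sketched as a simple ratio. The function name and the exact-path comparison are assumptions. Returning `False` for an empty declared list mirrors the current behavior described above, in which the guardrail is effectively inactive.

```python
def exceeds_scope(edited: list[str], declared: list[str], threshold: float = 0.5) -> bool:
    """True when more than `threshold` of the edited files are out of scope."""
    if not declared or not edited:
        # Nothing to compare against, so the guardrail cannot fire.
        return False
    declared_set = set(declared)
    outside = sum(1 for path in edited if path not in declared_set)
    return outside / len(edited) > threshold
```

With the default threshold, an exact 50/50 split does not block the issue; only a strict majority of out-of-scope files does.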
Budget limits
Every run has a budget cap. Defaults:
| Limit | Default | Meaning |
|---|---|---|
| maxIssues | 10 | Maximum issues to complete per run. |
| maxMinutes | 60 | Maximum wall-clock minutes per run. |
| maxSpawns | 15 | Maximum agent spawns per run. |
| maxCostUsd | $5.00 | Advisory cost limit (read from cost aggregator). |
Budget overrides are accepted by the backend IPC endpoint but the Goals panel UI does not yet expose them. The budget is checked before every new iteration, not just at the start.
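The per-iteration budget check might be sketched as follows. The class and field names are illustrative, and the defaults are copied from the table above. Because the cost limit is described as advisory, the sketch does not treat it as a hard stop; that reading is an assumption.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass(frozen=True)
class Budget:
    # Defaults copied from the limits table; names are illustrative.
    max_issues: int = 10
    max_minutes: float = 60.0
    max_spawns: int = 15
    max_cost_usd: float = 5.00  # advisory only, so not checked below


@dataclass
class Usage:
    issues: int = 0
    minutes: float = 0.0
    spawns: int = 0


def exhausted_limit(budget: Budget, usage: Usage) -> Optional[str]:
    """Return the name of the first limit reached, or None to keep going."""
    if usage.issues >= budget.max_issues:
        return "maxIssues"
    if usage.minutes >= budget.max_minutes:
        return "maxMinutes"
    if usage.spawns >= budget.max_spawns:
        return "maxSpawns"
    return None
```

Calling `exhausted_limit` before each new iteration, rather than once at the start, matches the behavior described above.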
Stop reasons
| Reason | Resumable? | Meaning |
|---|---|---|
| no-actionable-issues | No | All issues for this goal are done, blocked, or already attempted this run. |
| budget-exhausted | No | A budget limit (issues, minutes, spawns) was reached. |
| consecutive-failures | No | Three consecutive issues failed or blocked without recovery. |
| high-risk | No | A concurrent run guard blocked the start. |
| user-pause | Yes | You pressed Pause during a run. |
| checkpoint-required | Yes | The loop stopped at a safe point (supervised-step, journal failure, persist failure). |
| product-decision | No | The issue scope declaration was missing required fields; human review is needed. |
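For client code that consumes stop events, the table can be transcribed into a lookup. The mapping values come directly from the table; the names `STOP_REASONS` and `is_resumable`, and the choice to treat unknown reasons as non-resumable, are assumptions.

```python
# Resumability per stop reason, transcribed from the table above.
STOP_REASONS = {
    "no-actionable-issues": False,
    "budget-exhausted": False,
    "consecutive-failures": False,
    "high-risk": False,
    "user-pause": True,
    "checkpoint-required": True,
    "product-decision": False,
}


def is_resumable(reason: str) -> bool:
    """Unknown reasons are treated as non-resumable (an assumption)."""
    return STOP_REASONS.get(reason, False)
```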
A resumed run picks up from the resumeCandidate issue stored in the checkpoint.
Autopilot status in the Goals panel
While a run is active, the Goals panel shows a live autopilot status card:
| Field | Meaning |
|---|---|
| Run mode | Current mode: supervised-step or continuous. |
| Progress | Issues completed and spawns used so far in this run. |
| Current task | Issue ID and title currently being executed, if any. |
| Stop reason | Populated when the run has ended or paused. Explains why. |
| Resume candidate | Next issue that will be picked up on Resume, if the stop was resumable. |
Controls:
| Button | Available when |
|---|---|
| Run | Goal has open issues and no active autopilot run. |
| Pause | A run is currently active. Sends a pause signal; the run stops after the current issue completes. |
| Resume | Last run stopped with a resumable reason (user-pause or checkpoint-required). |
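The availability rules in the table can be expressed as a small state function. This is a sketch with hypothetical names; it also assumes Resume is hidden while a run is active, which the table implies but does not state.

```python
from typing import Optional

# Reasons that allow Resume, as listed in the controls table.
RESUMABLE_REASONS = {"user-pause", "checkpoint-required"}


def available_buttons(
    has_open_issues: bool,
    run_active: bool,
    last_stop_reason: Optional[str],
) -> set[str]:
    """Return the set of controls that should be enabled."""
    buttons: set[str] = set()
    if has_open_issues and not run_active:
        buttons.add("Run")
    if run_active:
        buttons.add("Pause")
    if not run_active and last_stop_reason in RESUMABLE_REASONS:
        buttons.add("Resume")
    return buttons
```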
Frugal self-improvement signals
At the end of every run, Agenties reads the local signal aggregator for improvement candidates. This is a pure file read: no model is invoked, no agent is spawned, and no action is taken automatically. If the aggregator finds a signal pattern with a count above the threshold (default 3), it surfaces a one-line hint alongside the stop event.
Acting on the hint always requires explicit user confirmation. The autopilot will never create a skill, open an issue, or spawn an agent to act on self-improvement data without asking first.
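The threshold filter itself is simple enough to sketch. The input shape (a mapping from signal pattern name to count) and the function name are assumptions, and "above the threshold" is read here as strictly greater than. The sketch only selects candidates; it deliberately takes no action on them.

```python
def improvement_candidates(signal_counts: dict[str, int], threshold: int = 3) -> list[str]:
    """Return signal patterns whose count is strictly above the threshold."""
    return sorted(
        name for name, count in signal_counts.items() if count > threshold
    )
```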