Autopilot — Supervised Autonomy

Agenties can run a goal's issues in sequence without waiting for you to issue each next command. This is supervised autonomy — not full automation. The orchestrator picks the next issue, spawns a builder agent, validates the result, and continues; but it stops at defined checkpoints so you remain in control.

Warning: Autopilot is not a "run and forget" mode. Agents still require observable evidence before an issue is marked done. Budget limits, guardrails, and mandatory checkpoints mean the loop will pause and ask for your confirmation rather than push through ambiguous situations.

Modes

| Mode | Behavior |
| --- | --- |
| supervised-step | Runs exactly one issue, then pauses with a checkpoint. Resume when you are ready for the next one. This is the only mode available from the Goals panel today. |
| continuous | Runs issues back-to-back until a stop condition is hit (budget, blocker, or 3 consecutive failures). When evidence is missing, the issue is marked blocked and the run continues; it does not interactively pause for confirmation. Not yet exposed in the Goals panel UI; available via the IPC API only. |

Start a run from the Goals panel: select a goal that has open issues, then press Run Autopilot. The run always starts in supervised-step mode — one issue, then a checkpoint.


The observe → decide → execute → verify → persist loop

Each iteration of the autopilot loop follows these steps:

| Step | What happens |
| --- | --- |
| Observe | Read the execution plan and current spawn budget. If budget pressure is high, stop immediately. |
| Decide | Select the next open issue for this goal. Apply priority order (critical → high → medium → low), then sequence number. Skip already-attempted issues and issues with unsatisfied dependencies. |
| Execute | Write a spawn entry for the builder agent. The agent receives the issue title, description, and extracted acceptance criteria. |
| Verify | Wait for a task-done mailbox message. Check the agent summary for observable evidence (test passes, commit hash, explicit output). If evidence is missing or the agent reports failure, the issue is marked blocked. |
| Persist | Write a journal entry. Failure to write the journal pauses the run immediately, because journal integrity is required before advancing. |
| Continue | Update progress counters, check budget limits, and start the next iteration. |
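
The ordering rule in the Decide step can be sketched as follows. This is a minimal illustration under stated assumptions, not the orchestrator's actual code: the `Issue` record, its field names, and `pick_next_issue` are hypothetical, and a dependency is assumed to be satisfied once the issue it points to is done.

```python
from dataclasses import dataclass, field
from typing import Optional

# Priority order from the Decide step: critical first, low last.
PRIORITY_RANK = {"critical": 0, "high": 1, "medium": 2, "low": 3}


@dataclass
class Issue:
    # Hypothetical shape; the real issue record may differ.
    seq: int
    priority: str
    status: str = "open"  # "open", "done", or "blocked"
    depends_on: list = field(default_factory=list)


def pick_next_issue(issues: list, attempted: set) -> Optional[Issue]:
    """Return the next issue to run, or None if nothing is actionable."""
    done = {i.seq for i in issues if i.status == "done"}
    candidates = [
        i for i in issues
        if i.status == "open"
        and i.seq not in attempted                    # skip already-attempted issues
        and all(dep in done for dep in i.depends_on)  # skip unsatisfied dependencies
    ]
    if not candidates:
        return None
    # Lowest priority rank wins; ties are broken by sequence number.
    return min(candidates, key=lambda i: (PRIORITY_RANK[i.priority], i.seq))
```

A `None` result corresponds to the case where no actionable issue remains for the goal.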

Guardrails

The following guardrails are enforced on every run. They cannot be disabled.

Concurrent run guard

If another autopilot run for the same goal is already active (status running or verifying), the new run is blocked immediately. This prevents two agents editing the same files simultaneously.
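
As a rough sketch of this check, assuming run records are plain dictionaries with hypothetical `goal_id` and `status` keys (the real storage format is not documented here):

```python
# Statuses that count as "already active" according to the guard description.
ACTIVE_STATUSES = {"running", "verifying"}


def can_start_run(goal_id: str, existing_runs: list) -> bool:
    """Return False if any run for the same goal is still active."""
    return not any(
        run["goal_id"] == goal_id and run["status"] in ACTIVE_STATUSES
        for run in existing_runs
    )
```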

Scope declaration

Before a builder is spawned, the task must have all required fields: issue ID, sequence number, title, declared files, and extracted acceptance criteria. A spawn without a valid scope declaration stops the run rather than proceeding blind.
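
A scope check of this kind might look like the sketch below. The key names and the dictionary shape are hypothetical. It also assumes that an empty declared-files list still counts as "declared", since the Scope containment section notes that the list is currently always empty; only an absent or `None` field is treated as missing.

```python
# Hypothetical key names for the five required fields.
REQUIRED_FIELDS = (
    "issue_id",
    "sequence_number",
    "title",
    "declared_files",
    "acceptance_criteria",
)


def missing_scope_fields(task: dict) -> list:
    """Return the names of required fields that are absent or None."""
    return [name for name in REQUIRED_FIELDS if task.get(name) is None]


def can_spawn(task: dict) -> bool:
    """A builder may only be spawned when no required field is missing."""
    return not missing_scope_fields(task)
```

When `can_spawn` returns False, the run would stop rather than spawn the builder.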

Evidence-required close

An issue is only marked done when the agent summary contains at least one observable signal:

Accepted evidence patterns
• Test output:  PASS / ✓ / "N tests passing"
• Runtime:      verified / confirmed / observed / rendered / persisted
• HTTP:         "responded with" / "curl response:" / "output:"
• Commit:       commit 7-char hash / git diff --stat

If none of these patterns are present in the summary, the issue is marked blocked instead of done, and the consecutive-failure counter increments.
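
The evidence check can be approximated with a handful of regular expressions. The sketch below is an assumption-laden illustration, not the shipped matcher: the exact expressions, case sensitivity, the acceptance of hashes longer than 7 characters, and the reset of the failure counter on success are all guesses, and `has_evidence` and `close_issue` are hypothetical names.

```python
import re

# One expression per evidence family listed above (approximations).
EVIDENCE_PATTERNS = [
    re.compile(r"\bPASS\b|✓|\b\d+ tests? passing\b"),
    re.compile(r"\b(verified|confirmed|observed|rendered|persisted)\b", re.IGNORECASE),
    re.compile(r"responded with|curl response:|output:", re.IGNORECASE),
    # "commit" followed by a hex hash of 7+ characters, or a diff stat command.
    re.compile(r"\bcommit [0-9a-f]{7,40}\b|git diff --stat", re.IGNORECASE),
]


def has_evidence(summary: str) -> bool:
    """True if the summary contains at least one observable signal."""
    return any(p.search(summary) for p in EVIDENCE_PATTERNS)


def close_issue(summary: str, consecutive_failures: int) -> tuple:
    """Return (new_status, new_failure_count) for one verified iteration."""
    if has_evidence(summary):
        return "done", 0  # assumption: a success resets the counter
    return "blocked", consecutive_failures + 1
```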

Journal mandatory

The continuity journal must be updated after every iteration. If the write fails, the run stops with reason checkpoint-required so you can investigate the disk state before continuing.

Drift detection

If the agent summary mentions CL-NNN references that do not include the expected issue sequence number, the autopilot treats the result as a drift event and marks the issue blocked. This catches agents that completed a different issue from the one they were assigned.
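
One way to read this rule: a summary that mentions CL references but never the assigned one is drift, while a summary with no CL references at all is not. The sketch below encodes that reading; `is_drift` is a hypothetical helper and the reference format is assumed to be `CL-` followed by digits.

```python
import re

CL_REF = re.compile(r"\bCL-(\d+)\b")


def is_drift(summary: str, expected_seq: int) -> bool:
    """True when CL references are present but none matches the assigned issue."""
    mentioned = {int(n) for n in CL_REF.findall(summary)}
    return bool(mentioned) and expected_seq not in mentioned
```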

Scope containment

The loop implements a 50% out-of-scope threshold: if the majority of files edited by an agent fall outside the declared scope, the issue is marked blocked. In the current implementation the declared file list is always empty (acceptance criteria text is extracted, but file paths are not parsed out of it), so this guardrail is not enforced in practice. It is wired and unit-tested for when the feature is completed.
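
The threshold can be illustrated as below. This sketch assumes exact path matching and reads "majority" as strictly more than 50%; `exceeds_scope` is a hypothetical name. It also shows why an empty declared list leaves the guardrail inactive.

```python
def exceeds_scope(edited_files: list, declared_files: list) -> bool:
    """True when more than half of the edited files are outside the declared scope."""
    if not declared_files or not edited_files:
        # Nothing to compare against, so the check cannot trigger.
        return False
    declared = set(declared_files)
    outside = sum(1 for path in edited_files if path not in declared)
    return outside / len(edited_files) > 0.5
```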


Budget limits

Every run has a budget cap. Defaults:

| Limit | Default | Meaning |
| --- | --- | --- |
| maxIssues | 10 | Maximum issues to complete per run. |
| maxMinutes | 60 | Maximum wall-clock minutes per run. |
| maxSpawns | 15 | Maximum agent spawns per run. |
| maxCostUsd | $5.00 | Advisory cost limit (read from cost aggregator). |

Budget overrides are accepted by the backend IPC endpoint but the Goals panel UI does not yet expose them. The budget is checked before every new iteration, not just at the start.
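
A per-iteration budget check might be sketched like this. The `Budget` class and function names are hypothetical. Treating the cost cap as a warning that does not stop the run is an assumption, based on the limit being described as advisory.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class Budget:
    # Defaults as listed above.
    max_issues: int = 10
    max_minutes: float = 60.0
    max_spawns: int = 15
    max_cost_usd: float = 5.00  # advisory only


def exhausted_limit(budget: Budget, issues_done: int,
                    minutes_elapsed: float, spawns_used: int) -> Optional[str]:
    """Return the name of the first hard limit reached, or None."""
    if issues_done >= budget.max_issues:
        return "maxIssues"
    if minutes_elapsed >= budget.max_minutes:
        return "maxMinutes"
    if spawns_used >= budget.max_spawns:
        return "maxSpawns"
    return None


def over_cost_advisory(budget: Budget, cost_usd: float) -> bool:
    """True when reported cost exceeds the advisory cap (assumed to warn only)."""
    return cost_usd > budget.max_cost_usd
```

Calling `exhausted_limit` at the top of every iteration mirrors the rule that the budget is checked before each new iteration rather than once at the start.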


Stop reasons

| Reason | Resumable? | Meaning |
| --- | --- | --- |
| no-actionable-issues | No | All issues for this goal are done, blocked, or already attempted this run. |
| budget-exhausted | No | A budget limit (issues, minutes, spawns) was reached. |
| consecutive-failures | No | Three consecutive issues failed or blocked without recovery. |
| high-risk | No | A concurrent run guard blocked the start. |
| user-pause | Yes | You pressed Pause during a run. |
| checkpoint-required | Yes | The loop stopped at a safe point (supervised-step, journal failure, persist failure). |
| product-decision | No | The issue scope declaration was missing required fields; human review is needed. |

Tip: Resumable runs show a Resume button in the Goals panel. The run picks up from the resumeCandidate issue stored in the checkpoint.
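
For scripting against these stop reasons, the resumable flag can be captured in a small lookup. This is only an illustrative sketch: the mapping copies the list above, `can_resume` is a hypothetical helper, and treating unknown reasons as non-resumable is an assumption.

```python
# Resumable flag per stop reason, copied from the list above.
STOP_REASONS = {
    "no-actionable-issues": False,
    "budget-exhausted": False,
    "consecutive-failures": False,
    "high-risk": False,
    "user-pause": True,
    "checkpoint-required": True,
    "product-decision": False,
}


def can_resume(stop_reason: str) -> bool:
    """True only for reasons documented as resumable; unknown reasons are not."""
    return STOP_REASONS.get(stop_reason, False)
```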

Autopilot status in the Goals panel

While a run is active, the Goals panel shows a live autopilot status card:

| Field | Meaning |
| --- | --- |
| Run mode | Current mode: supervised-step or continuous. |
| Progress | Issues completed and spawns used so far in this run. |
| Current task | Issue ID and title currently being executed, if any. |
| Stop reason | Populated when the run has ended or paused. Explains why. |
| Resume candidate | Next issue that will be picked up on Resume, if the stop was resumable. |

Controls:

| Button | Available when |
| --- | --- |
| Run | Goal has open issues and no active autopilot run. |
| Pause | A run is currently active. Sends a pause signal; the run stops after the current issue completes. |
| Resume | Last run stopped with a resumable reason (user-pause or checkpoint-required). |

Frugal self-improvement signals

At the end of every run, Agenties reads the local signal aggregator for improvement candidates. This is a pure file read — no model is invoked, no agent is spawned, no action is taken automatically. If the aggregator finds a signal pattern with a count above the threshold (default 3), it surfaces a one-line hint alongside the stop event:

Example hint
improvementSuggestion: "autopilot_iteration:CL-428 (4 signals — confirm before acting)"

Acting on the hint always requires explicit user confirmation. The autopilot will never create a skill, open an issue, or spawn an agent to act on self-improvement data without asking first.
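
The threshold rule can be sketched as a pure function over signal counts. This is a hypothetical illustration: `improvement_hint`, the dictionary input, and the choice to surface only the highest-count pattern are assumptions. "Above the threshold" is read as strictly greater than it, and the hint text follows the example format shown earlier.

```python
from typing import Optional


def improvement_hint(signal_counts: dict, threshold: int = 3) -> Optional[str]:
    """Return a one-line hint for the strongest signal above the threshold, if any."""
    above = {name: n for name, n in signal_counts.items() if n > threshold}
    if not above:
        return None
    pattern, count = max(above.items(), key=lambda item: item[1])
    # \u2014 is the dash character used in the example hint format.
    return f"{pattern} ({count} signals \u2014 confirm before acting)"
```

The function only builds a string; it performs no action, which matches the read-only nature of this step.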

Note: See Frugal Self-Improvement for the full signal model and the suggest_improvement MCP tool.