Autopilot — Supervised Autonomy

Agenties can run a goal's issues in sequence without waiting for you to issue each next command. This is supervised autonomy — not full automation. The orchestrator picks the next issue, spawns a builder agent, validates the result, and continues; but it stops at defined checkpoints so you remain in control.

Warning: Autopilot is not a "run and forget" mode. Agents still require observable evidence before an issue is marked done. Budget limits, guardrails, and mandatory checkpoints mean the loop will pause and ask for your confirmation rather than push through ambiguous situations.

Modes

Mode	Behavior
`supervised-step`	Runs exactly one issue, then pauses with a checkpoint. Resume when you are ready for the next one. This is the only mode available from the Goals panel today.
`continuous`	Runs issues back-to-back until a stop condition is hit (budget, blocker, or 3 consecutive failures). When evidence is missing the issue is marked blocked and the run continues; it does not interactively pause for confirmation. Not yet exposed in the Goals panel UI — available via the IPC API only.

Start a run from the Goals panel: select a goal that has open issues, then press Run Autopilot. The run always starts in supervised-step mode — one issue, then a checkpoint.
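The difference between the two modes can be sketched as a small simulation. This is an illustration only: `runUntilStop`, the `Outcome` type, and the choice to feed in pre-computed outcomes are assumptions made for the example, and budget checks are left out for brevity.

```typescript
// Hypothetical sketch: names and shapes are illustrative, not the Agenties API.
type Mode = "supervised-step" | "continuous";
type Outcome = "done" | "blocked" | "failed";

interface StopResult {
  processed: number;
  stopReason: "checkpoint-required" | "consecutive-failures" | "no-actionable-issues";
}

function runUntilStop(mode: Mode, outcomes: Outcome[], maxConsecutiveFailures = 3): StopResult {
  let consecutiveFailures = 0;
  let processed = 0;
  for (const outcome of outcomes) {
    processed++;
    // A successful issue resets the counter; blocked or failed issues increment it.
    consecutiveFailures = outcome === "done" ? 0 : consecutiveFailures + 1;
    // supervised-step always pauses after exactly one issue.
    if (mode === "supervised-step") {
      return { processed, stopReason: "checkpoint-required" };
    }
    if (consecutiveFailures >= maxConsecutiveFailures) {
      return { processed, stopReason: "consecutive-failures" };
    }
  }
  return { processed, stopReason: "no-actionable-issues" };
}
```

In this sketch a `supervised-step` run always returns after the first outcome, while a `continuous` run keeps going until three outcomes in a row are not `done` or the list is exhausted.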

The observe → decide → execute → verify → persist loop

Each iteration of the autopilot loop follows these steps:

Step	What happens
Observe	Read the execution plan and current spawn budget. If budget pressure is high, stop immediately.
Decide	Select the next open issue for this goal. Applies priority order (critical → high → medium → low), then sequential number. Skips already-attempted issues and issues with unsatisfied dependencies.
Execute	Write a spawn entry for the builder agent. The agent receives the issue title, description, and extracted acceptance criteria.
Verify	Wait for a task-done mailbox message. Check the agent summary for observable evidence (test passes, commit hash, explicit output). If evidence is missing or the agent reports failure, the issue is marked blocked.
Persist	Write a journal entry. Failure to write the journal pauses the run immediately — journal integrity is required before advancing.
Continue	Update progress counters, check budget limits, and start the next iteration.
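The Decide step's ordering rules can be sketched as a filter followed by a sort. The `Issue` shape, its field names, and `selectNextIssue` are assumptions for illustration; the real data model may differ.

```typescript
// Hypothetical issue shape; field names are assumptions, not the real schema.
type Priority = "critical" | "high" | "medium" | "low";

interface Issue {
  id: string;
  seq: number;
  priority: Priority;
  status: "open" | "done" | "blocked";
  dependsOn: string[];
}

const PRIORITY_RANK: Record<Priority, number> = { critical: 0, high: 1, medium: 2, low: 3 };

function selectNextIssue(issues: Issue[], attempted: Set<string>): Issue | undefined {
  const done = new Set(issues.filter((i) => i.status === "done").map((i) => i.id));
  return issues
    .filter((i) => i.status === "open")
    .filter((i) => !attempted.has(i.id)) // skip issues already attempted this run
    .filter((i) => i.dependsOn.every((dep) => done.has(dep))) // skip unsatisfied dependencies
    .sort((a, b) => PRIORITY_RANK[a.priority] - PRIORITY_RANK[b.priority] || a.seq - b.seq)[0];
}
```

Returning `undefined` here corresponds to the case where nothing is left to pick, which the run reports as `no-actionable-issues`.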

Guardrails

The following guardrails are enforced on every run. They cannot be disabled.

Concurrent run guard

If another autopilot run for the same goal is already active (status running or verifying), the new run is blocked immediately. This prevents two agents editing the same files simultaneously.
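A minimal sketch of this guard, assuming runs are tracked as simple records. `RunRecord`, `isStartBlocked`, and the `paused` and `stopped` statuses are hypothetical; only `running` and `verifying` come from the description above.

```typescript
// Hypothetical run record; only "running" and "verifying" are documented statuses.
type RunStatus = "running" | "verifying" | "paused" | "stopped";

interface RunRecord {
  goalId: string;
  status: RunStatus;
}

function isStartBlocked(goalId: string, runs: RunRecord[]): boolean {
  // A new run is blocked only by an active run for the same goal.
  return runs.some(
    (r) => r.goalId === goalId && (r.status === "running" || r.status === "verifying"),
  );
}
```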

Scope declaration

Before a builder is spawned, the task must have all required fields: issue ID, sequence number, title, declared files, and extracted acceptance criteria. A spawn without a valid scope declaration stops the run rather than proceeding blind.
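One way to picture the scope check is a function that lists the missing fields. The `ScopeDeclaration` shape and `missingScopeFields` are assumptions for illustration. The sketch also assumes that list fields only need to be present, not non-empty.

```typescript
// Hypothetical scope declaration; field names are assumptions for this sketch.
interface ScopeDeclaration {
  issueId: string;
  seq: number;
  title: string;
  declaredFiles: string[];
  acceptanceCriteria: string[];
}

function missingScopeFields(scope: Partial<ScopeDeclaration>): string[] {
  const missing: string[] = [];
  if (!scope.issueId) missing.push("issueId");
  if (typeof scope.seq !== "number") missing.push("seq");
  if (!scope.title) missing.push("title");
  // Assumption: the lists must exist, but an empty list is accepted.
  if (!Array.isArray(scope.declaredFiles)) missing.push("declaredFiles");
  if (!Array.isArray(scope.acceptanceCriteria)) missing.push("acceptanceCriteria");
  return missing;
}
```

A non-empty result would correspond to stopping the run for human review instead of spawning the builder.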

Evidence-required close

An issue is only marked done when the agent summary contains at least one observable signal:

Accepted evidence patterns

• Test output:  PASS / ✓ / "N tests passing"
• Runtime:      verified / confirmed / observed / rendered / persisted
• HTTP:         "responded with" / "curl response:" / "output:"
• Commit:       commit 7-char hash / git diff --stat

If none of these patterns are present in the summary, the issue is marked blocked instead of done, and the consecutive-failure counter increments.
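The evidence check can be approximated with a handful of regular expressions. The expressions below are a guess at how the listed patterns might be matched, not the actual rules, and `hasObservableEvidence` is a hypothetical name.

```typescript
// Illustrative patterns only; the real matcher may be stricter or looser.
const EVIDENCE_PATTERNS: RegExp[] = [
  /\bPASS\b|✓|\b\d+ tests? passing\b/, // test output
  /\b(verified|confirmed|observed|rendered|persisted)\b/i, // runtime
  /responded with|curl response:|output:/i, // HTTP
  /\bcommit [0-9a-f]{7,40}\b|git diff --stat/i, // commit
];

function hasObservableEvidence(summary: string): boolean {
  // One matching signal is enough to close the issue as done.
  return EVIDENCE_PATTERNS.some((pattern) => pattern.test(summary));
}
```

Under these assumptions, a summary such as "Implemented the feature" would leave the issue blocked, while "12 tests passing" would allow it to close.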

Journal mandatory

The continuity journal must be updated after every iteration. If the write fails, the run stops with reason checkpoint-required so you can investigate the disk state before continuing.

Drift detection

If the agent summary mentions CL-NNN issue references but none of them matches the sequence number of the assigned issue, the autopilot treats the result as a drift event and marks the issue blocked. This catches agents that completed a different issue from the one they were assigned.
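A sketch of the drift rule, assuming references have the form `CL-` followed by digits and that a summary with no references at all is not treated as drift. `isDrift` is a hypothetical name.

```typescript
// Hypothetical helper: flags summaries that only mention other issues.
function isDrift(summary: string, expectedSeq: number): boolean {
  const refs = summary.match(/\bCL-\d+\b/g) ?? [];
  const seqs = refs.map((ref) => Number(ref.slice(3))); // drop the "CL-" prefix
  // Drift: at least one reference is present, and none is the assigned issue.
  return seqs.length > 0 && !seqs.some((seq) => seq === expectedSeq);
}
```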

Scope containment

The loop implements a 50% out-of-scope threshold: if the majority of files edited by an agent fall outside the declared scope, the issue is marked blocked. In the current implementation the declared file list is always empty (acceptance criteria text is extracted, but file paths are not parsed out of it), so this guardrail is not enforced in practice. It is wired and unit-tested for when the feature is completed.
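The threshold logic might look like the following sketch. `exceedsScope` is a hypothetical name, paths are compared by exact string match, and "majority" is read as strictly more than 50%. All three are assumptions.

```typescript
// Hypothetical helper; exact-path matching is a simplification.
function exceedsScope(editedFiles: string[], declaredFiles: string[], threshold = 0.5): boolean {
  // With an empty declaration there is nothing to compare against,
  // which mirrors why the guardrail has no effect today.
  if (declaredFiles.length === 0 || editedFiles.length === 0) return false;
  const declared = new Set(declaredFiles);
  const outside = editedFiles.filter((file) => !declared.has(file)).length;
  return outside / editedFiles.length > threshold;
}
```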

Budget limits

Every run has a budget cap. Defaults:

Limit	Default	Meaning
`maxIssues`	10	Maximum issues to complete per run.
`maxMinutes`	60	Maximum wall-clock minutes per run.
`maxSpawns`	15	Maximum agent spawns per run.
`maxCostUsd`	$5.00	Advisory cost limit (read from cost aggregator).

Budget overrides are accepted by the backend IPC endpoint but the Goals panel UI does not yet expose them. The budget is checked before every new iteration, not just at the start.
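A minimal sketch of a per-iteration budget check using the defaults above. The `Usage` shape and `checkBudget` are hypothetical. The sketch treats the cost limit as a warning only, on the assumption that "advisory" means it does not stop the run.

```typescript
interface Budget {
  maxIssues: number;
  maxMinutes: number;
  maxSpawns: number;
  maxCostUsd: number;
}

// Hypothetical usage counters for the current run.
interface Usage {
  issuesCompleted: number;
  elapsedMinutes: number;
  spawns: number;
  costUsd: number;
}

const DEFAULT_BUDGET: Budget = { maxIssues: 10, maxMinutes: 60, maxSpawns: 15, maxCostUsd: 5 };

function checkBudget(usage: Usage, overrides: Partial<Budget> = {}): { exhausted: boolean; costWarning: boolean } {
  const budget: Budget = { ...DEFAULT_BUDGET, ...overrides };
  const exhausted =
    usage.issuesCompleted >= budget.maxIssues ||
    usage.elapsedMinutes >= budget.maxMinutes ||
    usage.spawns >= budget.maxSpawns;
  // Assumption: the cost limit is advisory, so it is reported but does not stop the run.
  const costWarning = usage.costUsd >= budget.maxCostUsd;
  return { exhausted, costWarning };
}
```

Calling this before each iteration, rather than once at the start, matches the behavior described above.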

Stop reasons

Reason	Resumable?	Meaning
`no-actionable-issues`	No	All issues for this goal are done, blocked, or already attempted this run.
`budget-exhausted`	No	A budget limit (issues, minutes, spawns) was reached.
`consecutive-failures`	No	Three consecutive issues failed or were blocked without recovery.
`high-risk`	No	The concurrent run guard blocked the start.
`user-pause`	Yes	You pressed Pause during a run.
`checkpoint-required`	Yes	The loop stopped at a safe point (supervised-step, journal failure, persist failure).
`product-decision`	No	The issue scope declaration was missing required fields — human review needed.

Tip: Resumable runs show a Resume button in the Goals panel. The run picks up from the resumeCandidate issue stored in the checkpoint.
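The resumable reasons from the table can be captured in a small lookup. The `Checkpoint` shape and `nextIssueOnResume` are assumptions for illustration. Only the reason names and the `resumeCandidate` field come from this page.

```typescript
type StopReason =
  | "no-actionable-issues"
  | "budget-exhausted"
  | "consecutive-failures"
  | "high-risk"
  | "user-pause"
  | "checkpoint-required"
  | "product-decision";

const RESUMABLE_REASONS = new Set<StopReason>(["user-pause", "checkpoint-required"]);

// Hypothetical checkpoint shape for this sketch.
interface Checkpoint {
  stopReason: StopReason;
  resumeCandidate?: string;
}

function nextIssueOnResume(checkpoint: Checkpoint): string | null {
  // Non-resumable stops never offer a candidate, even if one was stored.
  if (!RESUMABLE_REASONS.has(checkpoint.stopReason)) return null;
  return checkpoint.resumeCandidate ?? null;
}
```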

Autopilot status in the Goals panel

While a run is active, the Goals panel shows a live autopilot status card:

Field	Meaning
Run mode	Current mode: `supervised-step` or `continuous`.
Progress	Issues completed and spawns used so far in this run.
Current task	Issue ID and title currently being executed, if any.
Stop reason	Populated when the run has ended or paused. Explains why.
Resume candidate	Next issue that will be picked up on Resume, if the stop was resumable.

Controls:

Button	Available when
Run	Goal has open issues and no active autopilot run.
Pause	A run is currently active. Sends a pause signal; the run stops after the current issue completes.
Resume	Last run stopped with a resumable reason (user-pause or checkpoint-required).

Frugal self-improvement signals

At the end of every run, Agenties reads the local signal aggregator for improvement candidates. This is a pure file read — no model is invoked, no agent is spawned, no action is taken automatically. If the aggregator finds a signal pattern with a count above the threshold (default 3), it surfaces a one-line hint alongside the stop event:

Example hint

improvementSuggestion: "autopilot_iteration:CL-428 (4 signals — confirm before acting)"

Acting on the hint always requires explicit user confirmation. The autopilot will never create a skill, open an issue, or spawn an agent to act on self-improvement data without asking first.
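The threshold rule can be sketched as a pure function over aggregated counts. `SignalCount` and `pickImprovementCandidate` are hypothetical, "above the threshold" is read as strictly greater, and surfacing only the highest-count pattern is an assumption.

```typescript
// Hypothetical aggregated signal entry.
interface SignalCount {
  key: string;
  count: number;
}

function pickImprovementCandidate(signals: SignalCount[], threshold = 3): SignalCount | null {
  const above = signals.filter((signal) => signal.count > threshold);
  if (above.length === 0) return null;
  // Assumption: surface only the pattern with the highest count.
  return above.reduce((best, signal) => (signal.count > best.count ? signal : best));
}
```

The function only selects a candidate. It takes no action, in keeping with the rule that acting on a hint requires explicit confirmation.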

Note: See Frugal Self-Improvement for the full signal model and the suggest_improvement MCP tool.
