Error Code Reference
testing-os' CLIs surface structured errors at the top-level seam via `renderTopLevelError` (`packages/dogfood-swarm/lib/error-render.js`). Every typed error carries:

- `code`: stable identifier (e.g. `ISOLATION_FAILED`)
- `message`: operator-facing prose
- `hint`: explicit next step (or a per-code derived hint when the error class did not set one)
- optional `cause` (`Caused by: …`), `runId`, `waveId`, `agentRunId`, `findingsAttempted`

CLI output shape:

    ERROR [<CODE>]: <message>
    Next: <hint>
    Caused by: <inner error message>
    Wave: <waveId>

Untyped errors keep the original `ERROR: <message>` single-line shape. A leading `ERROR [<CODE>]:` is the signal that one of the codes below is in play.
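The output shape above can be sketched as a small renderer. This is an illustration only, not the real `renderTopLevelError`: the function name `renderErrorSketch`, the one-field-per-line layout, and the sample error are all assumptions.

```javascript
// Hypothetical sketch; NOT the real renderTopLevelError implementation.
function renderErrorSketch(err) {
  // Untyped errors (no string `code`) keep the single-line shape.
  if (!err || typeof err.code !== 'string') {
    return `ERROR: ${err && err.message ? err.message : String(err)}`;
  }
  const lines = [`ERROR [${err.code}]: ${err.message}`];
  if (err.hint) lines.push(`Next: ${err.hint}`);
  if (err.cause && err.cause.message) lines.push(`Caused by: ${err.cause.message}`);
  if (err.waveId !== undefined) lines.push(`Wave: ${err.waveId}`);
  return lines.join('\n');
}

// Made-up sample error for illustration.
const typedSample = Object.assign(new Error('worktree creation failed'), {
  code: 'ISOLATION_FAILED',
  hint: 'run `git worktree list` to inspect existing worktrees',
  cause: new Error('fatal: path already exists'),
});
const renderedTyped = renderErrorSketch(typedSample);
const renderedPlain = renderErrorSketch(new Error('boom'));
console.log(renderedTyped);
console.log(renderedPlain);
```

A script that consumes CLI output can key on the bracketed prefix of the first line to decide whether a documented code is in play.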
Severity tiers — fix order at a glance
| Severity | Visual cue | Meaning | Operator response |
|---|---|---|---|
| CRITICAL | :::danger callout (red ⊘) | Persistent state corrupted or contract broken; a record / index is wrong, not just absent | Stop ingesting, repair the underlying state, then resume |
| HIGH | :::caution callout (orange ⚠) | Operator action required before the system can make progress; one run lost | Diagnose using the hint, fix the upstream cause, re-dispatch |
| MEDIUM | :::note callout (blue ℹ) | Informational — a race or transient issue handled gracefully | Inspect the persisted state with the suggested CLI, then continue |
| LOW | :::tip callout (green ✓) | Caller bug surfaced as a state-machine reject; system state is consistent | Fix the caller; no recovery needed on the testing-os side |
Severity is encoded by the Starlight callout type at the top of each code below — color is paired with the icon and the bolded Severity: title, so a color-blind operator gets the same fix-order signal from the icon + word as a sighted operator gets from the hue. WCAG AA contrast ratios for each callout variant are asserted by scripts/check-severity-contrast.test.mjs.
RECORD_SCHEMA_INVALID
- Class: `RecordValidationError` (`packages/ingest/validate-record.js`)
- Trigger: A persisted record fails AJV validation against `dogfood-record.schema.json`. Surfaced from `validateRecord()` during ingest.
- Message shape: `persisted record failed schema validation: <path> <ajv message>; <path> <ajv message>; …`
- Hint: `inspect the failing record against packages/schemas/src/json/dogfood-record.schema.json and fix the invalid fields before re-ingesting`
- Operator action:
  - Open `packages/schemas/src/json/dogfood-record.schema.json` and locate each path from the message.
  - The error object also carries `errors[]` with `{ path, keyword, message }` for programmatic inspection.
  - Fix the upstream emitter (the source repo's submission builder), not the schema. The schema is a contract.
  - Re-dispatch the source workflow to produce a clean record.
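For programmatic inspection, the `errors[]` array can be grouped by path so that each failing field is fixed once. A minimal sketch, assuming only the `{ path, keyword, message }` shape described above; the helper name and the sample error object are hypothetical.

```javascript
// Assumed shape from the text: err.errors is an array of { path, keyword, message }.
function groupSchemaErrorsByPath(err) {
  const byPath = new Map();
  for (const { path, keyword, message } of err.errors ?? []) {
    if (!byPath.has(path)) byPath.set(path, []);
    byPath.get(path).push(`${keyword}: ${message}`);
  }
  return byPath;
}

// Hypothetical error object for illustration only.
const sampleRecordError = {
  code: 'RECORD_SCHEMA_INVALID',
  errors: [
    { path: '/run_id', keyword: 'type', message: 'must be string' },
    { path: '/steps/0', keyword: 'required', message: "must have required property 'name'" },
    { path: '/run_id', keyword: 'minLength', message: 'must NOT have fewer than 1 characters' },
  ],
};
const groupedRecordErrors = groupSchemaErrorsByPath(sampleRecordError);
for (const [path, problems] of groupedRecordErrors) {
  console.log(path, problems.join('; '));
}
```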
DUPLICATE_RUN_ID
- Class: `DuplicateRunIdError` (`packages/ingest/persist.js`)
- Trigger: `writeRecord` lost a TOCTOU race for the same canonical record path. Two concurrent writers tried to persist the same `run_id`; the first won.
- Message shape: `duplicate run_id: <run_id> — another writer won the race for <path>`
- Hint: ``a run with this id already exists — use a fresh run id or `swarm runs` to inspect the existing one``
- Carries: `runId`, `path`
- Operator action:
  - In ingest: this is informational. The first writer succeeded and the system is consistent. Re-running the source workflow with a fresh `run_id` produces a new record.
  - In swarm: `swarm runs` lists existing runs by id. Either re-dispatch with a fresh id or accept the existing record.
ISOLATION_FAILED
- Class: `IsolationError` (`packages/dogfood-swarm/lib/errors.js`)
- Trigger: `--isolate` was requested on a `swarm dispatch` but `createWorktree()` failed. Pre-fix, dispatch silently fell back to running the agent in the main repo. Isolation is now a contract: the only valid responses are "isolated" or "loud failure".
- Message shape: the underlying worktree error wrapped with the explicit isolation context. Inspect `e.cause.message` for the git-level reason.
- Hint: ``run `git worktree list` to inspect existing worktrees, or re-dispatch without --isolate``
- Operator action:
  - `git worktree list` from the repo root to see what's already attached.
  - `git worktree prune` to clean stale references; `git worktree remove <path>` to clear specific entries.
  - Re-dispatch with `--isolate`, or drop `--isolate` if isolation is not required for this run (accepting the shared-workspace risk).
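Because the git-level reason lives on `e.cause.message`, a caller that logs only `e.message` loses it. The sketch below walks a `cause` chain from the wrapper down to the innermost error. The helper name `causeChain` and both sample messages are made up for illustration.

```javascript
// Collect messages from the outermost error to the innermost cause.
// maxDepth guards against accidental cycles in the cause chain.
function causeChain(err, maxDepth = 10) {
  const messages = [];
  let current = err;
  while (current && messages.length < maxDepth) {
    messages.push(current.message);
    current = current.cause;
  }
  return messages;
}

// Hypothetical errors; the git message text is invented for the example.
const gitFailure = new Error("fatal: 'wt/agent-1' already exists");
const isolationFailure = Object.assign(
  new Error('isolation requested but worktree creation failed'),
  { code: 'ISOLATION_FAILED', cause: gitFailure },
);
const isolationChain = causeChain(isolationFailure);
console.log(isolationChain.join('\n  Caused by: '));
```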
COLLECT_UPSERT_FAILED
- Class: `CollectUpsertError` (`packages/dogfood-swarm/lib/errors.js`)
- Trigger: `swarm collect`'s findings upsert transaction threw. Common underlying causes: SQLite `busy_timeout` exhaustion, fingerprint UNIQUE collision, prepared-statement crash. The artifact rows + file_claims + agent state transitions had already committed; the wave-status UPDATE had not.
- Message shape: structured wrapper with `e.cause.message` carrying the SQLite-level reason.
- Hint: ``wave <id> has artifacts persisted but findings missing — inspect with `swarm status`, then re-run `swarm collect` once the underlying SQLite issue is resolved (busy_timeout or fingerprint UNIQUE collision)``
- Carries: `waveId`, `findingsAttempted`, `cause`
- Operator action:
  - `swarm status` to confirm the wave is in a half-written state (artifacts present, findings missing).
  - Diagnose the underlying SQLite issue from `Caused by:`. `busy_timeout` usually means another process holds the DB; check for stuck `swarm` processes. A UNIQUE collision usually means the fingerprint algorithm matched an existing row; check `swarms/control-plane.db` for the colliding finding.
  - Re-run `swarm collect` for the same wave once resolved. The outer wrapper is idempotent at the upsert level.
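The diagnosis step can be roughed out as a classifier over the wrapped cause. This is a sketch under assumptions: it matches SQLite's standard `database is locked` and `UNIQUE constraint failed` message texts (plus a `SQLITE_BUSY` code string, if the driver sets one), and the function name and sample errors are hypothetical. The actual strings depend on the SQLite driver in use.

```javascript
// Classify the underlying cause of a COLLECT_UPSERT_FAILED error.
// Matching on message text is an assumption; check what your driver emits.
function classifyCollectCause(err) {
  const text = `${err?.cause?.code ?? ''} ${err?.cause?.message ?? ''}`;
  if (/SQLITE_BUSY|database is locked/i.test(text)) return 'busy';
  if (/UNIQUE constraint failed/i.test(text)) return 'unique-collision';
  return 'other';
}

// Hypothetical wrapped errors for illustration.
const busySample = { code: 'COLLECT_UPSERT_FAILED', cause: { message: 'database is locked' } };
const uniqueSample = {
  code: 'COLLECT_UPSERT_FAILED',
  cause: { message: 'UNIQUE constraint failed: findings.fingerprint' },
};
console.log(classifyCollectCause(busySample));   // look for stuck swarm processes
console.log(classifyCollectCause(uniqueSample)); // inspect the colliding finding
```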
CONTROL_PLANE_SCHEMA_TOO_NEW
- Class: plain `Error` (no `.code` field yet) thrown by `openDb` in `packages/dogfood-swarm/db/connection.js`. Surfaces through the same top-level seam as the typed errors, but as the untyped `ERROR: <message>` single-line shape (it has no `code`/`hint`), not the `ERROR [<CODE>]:` shape. `CONTROL_PLANE_SCHEMA_TOO_NEW` is the documentation identifier for this failure mode, not a value carried on the error object today. See "Follow-ups" below.
- Trigger: `openDb()` read `schema_version` from the DB's `kv` table and found it greater than the `SCHEMA_VERSION` this build understands. The shared `swarms/control-plane.db` is committed back to `main` by `ingest.yml`; an operator on an older checkout (or a stale CI cache) can open a DB that a newer `main` already migrated. Neither the create branch (`version < 1`) nor the upgrade branch (`version < SCHEMA_VERSION`) fires, so without this refusal `openDb` would silently proceed against an unknown-newer shape.
- Message shape: `control-plane.db at <dbPath> is schema v<version> but this @dogfood-lab/dogfood-swarm build only understands v<SCHEMA_VERSION>. Pull the latest @dogfood-lab/dogfood-swarm before opening this DB.`
- Recovery (the message says it too): this is not DB corruption and needs no manual DB surgery. The remedy is to upgrade the tool to match the DB:
  - Pull the latest `main` / re-install `@dogfood-lab/dogfood-swarm` so your build's `SCHEMA_VERSION` is `>=` the on-disk version.
  - Re-run the command. `openDb` will then take the normal create/upgrade path.
  - Do not hand-edit `swarms/control-plane.db` or delete it to "fix" the version; that discards the migrated state the newer build wrote.
- Follow-ups (out of scope for the doc-only fix that added this entry):
  - Promote the plain `throw new Error(...)` in `connection.js` to a typed error carrying `code: 'CONTROL_PLANE_SCHEMA_TOO_NEW'` + a `hint` so it renders through `renderTopLevelError` as `ERROR [CONTROL_PLANE_SCHEMA_TOO_NEW]:` like the other codes here.
  - Add `packages/dogfood-swarm/db/connection.js` to the `error-codes` drift gate's `sources` in `scripts/doc-drift-patterns.json` once the typed `.code` lands, so this entry is enforced by the same coverage gate as the rest of the family.
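The three-way version check and the typed error proposed in the follow-ups might look roughly like the sketch below. It is not the code in `connection.js`: the class name `SchemaTooNewError`, the helper `decideSchemaAction`, the constant value, and the shortened message are all assumptions.

```javascript
// Hypothetical value; stands in for the build's SCHEMA_VERSION.
const SCHEMA_VERSION_SKETCH = 3;

// Sketch of the typed error the follow-up proposes (name is an assumption).
class SchemaTooNewError extends Error {
  constructor(dbPath, version, supported) {
    super(`control-plane.db at ${dbPath} is schema v${version} but this build only understands v${supported}.`);
    this.name = 'SchemaTooNewError';
    this.code = 'CONTROL_PLANE_SCHEMA_TOO_NEW';
    this.hint = 'pull the latest @dogfood-lab/dogfood-swarm before opening this DB';
  }
}

// Decide which branch to take for the on-disk schema version.
function decideSchemaAction(version, dbPath = 'swarms/control-plane.db') {
  // Refuse first: a newer on-disk shape must never fall through silently.
  if (version > SCHEMA_VERSION_SKETCH) {
    throw new SchemaTooNewError(dbPath, version, SCHEMA_VERSION_SKETCH);
  }
  if (version < 1) return 'create';
  if (version < SCHEMA_VERSION_SKETCH) return 'upgrade';
  return 'open';
}

console.log(decideSchemaAction(0)); // fresh DB
console.log(decideSchemaAction(2)); // older DB
console.log(decideSchemaAction(3)); // current DB
```

With a `code` and `hint` on the thrown object, the error would carry the fields every typed error in this reference is described as carrying.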
DISPATCH_RUN_NOT_FOUND
- Class: `DispatchPreconditionError` (`packages/dogfood-swarm/lib/errors.js`)
- Trigger: `dispatch()` looked up `runs.id` and got no row. Either the run id is mistyped, or no `swarm init` has been run for this repo.
- Message shape: `Run not found: <run-id>`
- Hint: ``check `swarm runs` for the correct run id, or `swarm init` to create a fresh run``
- NDJSON event emitted before throw: `dispatch_precondition_failed` with `code=DISPATCH_RUN_NOT_FOUND`, `runId`, `phase`, `correlation_id`.
- Operator action:
  - `swarm runs` to list all known runs.
  - If the run doesn't exist, `swarm init <repo-path>` to create it.
DISPATCH_DOMAINS_NOT_FROZEN
- Class: `DispatchPreconditionError` (`packages/dogfood-swarm/lib/errors.js`)
- Trigger: `aredomainsFrozen(runId)` returned false and `--auto-freeze` was not passed.
- Message shape: `Domains are not frozen. Review and freeze before dispatching, or pass --auto-freeze.`
- Hint: ``run `swarm domains --freeze` after reviewing, or re-run dispatch with --auto-freeze``
- NDJSON event emitted before throw: `dispatch_precondition_failed` with `code=DISPATCH_DOMAINS_NOT_FROZEN`.
- Operator action:
  - `swarm domains <run-id>` to inspect the current draft.
  - `swarm domains <run-id> --freeze` to lock the map, OR re-run with `--auto-freeze`.
DISPATCH_NO_DOMAINS
- Class: `DispatchPreconditionError` (`packages/dogfood-swarm/lib/errors.js`)
- Trigger: `getDomains(runId).length === 0`. Usually means `swarm init` produced no auto-detected domains and the operator hasn't added any manually.
- Message shape: `No domains defined for this run`
- Hint: ``run `swarm domains --add --globs "[…]"` then --freeze``
- NDJSON event emitted before throw: `dispatch_precondition_failed` with `code=DISPATCH_NO_DOMAINS`.
- Operator action:
  - `swarm domains <run-id> --add <name> --globs '["packages/foo/**"]'` to define at least one domain.
  - `swarm domains <run-id> --freeze`.
DISPATCH_INVALID_PHASE
- Class: `DispatchPreconditionError` (`packages/dogfood-swarm/lib/errors.js`), the same class as the other `DISPATCH_*` preconditions; `code` is part of the JSDoc union contract.
- Trigger: `dispatch()` checked `opts.phase` against `AUDIT_PHASES` and `AMEND_PHASES` (in `packages/dogfood-swarm/commands/dispatch.js`) before any DB mutation and found neither matched, i.e. a mistyped phase such as `helth-audit-a`.
- Message shape: `Unknown phase: <phase>`
- Hint: `valid phases: <AUDIT_PHASES ∪ AMEND_PHASES>`, currently `health-audit-a, health-audit-b, health-audit-c, stage-d-audit, feature-audit, health-amend-a, health-amend-b, health-amend-c, stage-d-amend, feature-execute`. When the thrown error carries no `.hint`, `renderTopLevelError` derives the same enumeration.
- NDJSON event emitted before throw: `dispatch_precondition_failed` with `code=DISPATCH_INVALID_PHASE`, `runId`, `phase`.
- Carries: `runId`, `phase`.
- Operator action:
  - Re-invoke with a phase from the list above, e.g. `swarm dispatch <run-id> health-audit-a`.
  - The control plane is untouched, so no cleanup is needed before retrying.
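A minimal sketch of a phase check that runs before any mutation and carries the fields listed above. The split of the ten phases into two arrays is an assumption based on their names, and `checkPhaseSketch` is a hypothetical helper, not the code in `dispatch.js`.

```javascript
// Assumed split of the documented phases; the real arrays live in dispatch.js.
const AUDIT_PHASES_SKETCH = ['health-audit-a', 'health-audit-b', 'health-audit-c', 'stage-d-audit', 'feature-audit'];
const AMEND_PHASES_SKETCH = ['health-amend-a', 'health-amend-b', 'health-amend-c', 'stage-d-amend', 'feature-execute'];

// Validate the phase up front so a typo never reaches the control plane.
function checkPhaseSketch(runId, phase) {
  const valid = [...AUDIT_PHASES_SKETCH, ...AMEND_PHASES_SKETCH];
  if (valid.includes(phase)) return phase;
  const err = new Error(`Unknown phase: ${phase}`);
  err.code = 'DISPATCH_INVALID_PHASE';
  err.hint = `valid phases: ${valid.join(', ')}`;
  err.runId = runId;
  err.phase = phase;
  throw err;
}

try {
  checkPhaseSketch('run-123', 'helth-audit-a'); // mistyped on purpose
} catch (e) {
  console.log(`ERROR [${e.code}]: ${e.message}`);
  console.log(`Next: ${e.hint}`);
}
```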
CLI_INVALID_GLOBS_JSON
- Class: `CliInvalidGlobsError` (`packages/dogfood-swarm/lib/errors.js`)
- Trigger: `swarm domains --add/--edit --globs <raw>` invoked with a `raw` value that:
  - is empty
  - fails `JSON.parse`
  - parses to a non-array
  - parses to an empty array
  - contains a non-string element
- Message shape: `--globs requires a JSON array of glob strings; <specific reason>`
- Hint: `pass --globs '["packages/foo/**"]' — wrap the JSON in single quotes so the shell preserves it, and use double quotes for each glob string`
- Carries: `received` (the raw input, possibly truncated), `cause` (the inner JSON.parse error message).
- Operator action:
  - Re-invoke with shell-safe quoting: `--globs '["packages/foo/**", "packages/bar/**"]'`.
  - On Windows PowerShell, escape inner double quotes or use the single-quote outer form per shell rules.
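The five rejection cases can be illustrated with a small validator. This is a sketch, not the real parser: `parseGlobsSketch`, the reason strings after the semicolon, and the 80-character truncation are assumptions.

```javascript
// Hypothetical validator mirroring the five documented rejection cases.
function parseGlobsSketch(raw) {
  const fail = (reason, causeMessage) => {
    const err = new Error(`--globs requires a JSON array of glob strings; ${reason}`);
    err.code = 'CLI_INVALID_GLOBS_JSON';
    err.received = String(raw ?? '').slice(0, 80); // truncation length is an assumption
    if (causeMessage) err.cause = causeMessage;
    return err;
  };
  if (raw === undefined || raw === null || String(raw).trim() === '') throw fail('value is empty');
  let parsed;
  try {
    parsed = JSON.parse(raw);
  } catch (e) {
    throw fail('value is not valid JSON', e.message);
  }
  if (!Array.isArray(parsed)) throw fail('value is not an array');
  if (parsed.length === 0) throw fail('array is empty');
  if (!parsed.every((glob) => typeof glob === 'string')) throw fail('array contains a non-string element');
  return parsed;
}

console.log(parseGlobsSketch('["packages/foo/**", "packages/bar/**"]'));
```

A common way to hit the `JSON.parse` case is a shell stripping the inner double quotes, which leaves something like `[packages/foo/**]` as the raw value.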
CLI_INVALID_THRESHOLD
- Class: plain `Error` with `e.code = 'CLI_INVALID_THRESHOLD'` set in `parseVerifyFlags` (`packages/dogfood-swarm/cli.js`). Surfaced through the same top-level seam (`renderTopLevelError`) as `CLI_INVALID_GLOBS_JSON`.
- Trigger: `swarm verify --threshold <raw>` (space-form `--threshold N` or equals-form `--threshold=N`) invoked with a `raw` value that is not a non-negative integer, e.g. `foo`, `-1`, or a partially-numeric `3abc`. Both flag forms route through the same validator, so a typo like `--threshold=1O` (letter O) is rejected rather than silently becoming the strictest gate (`0`).
- Message shape: `--threshold expects a non-negative integer; got '<raw>'`
- Hint: ``pass an integer >= 0, e.g. `--threshold 0` or `--threshold=3` ``
- Carries: `received` (the raw input).
- Operator action:
  - Re-invoke with an integer `>= 0`: `swarm verify <run-id> --threshold 0`.
  - A typo'd threshold exits non-zero by design, so a CI gate keyed on `$?` will not mistake a malformed threshold for a passing run.
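Strict validation matters here because lenient parsers accept partial input. The sketch below validates the whole string before converting; `parseThresholdSketch` is a hypothetical stand-in for the check inside `parseVerifyFlags`, not its actual code.

```javascript
// Accept only a string made entirely of decimal digits.
function parseThresholdSketch(raw) {
  if (typeof raw !== 'string' || !/^\d+$/.test(raw)) {
    const err = new Error(`--threshold expects a non-negative integer; got '${raw}'`);
    err.code = 'CLI_INVALID_THRESHOLD';
    err.received = raw;
    throw err;
  }
  return Number(raw);
}

console.log(parseThresholdSketch('0'));
console.log(parseThresholdSketch('3'));
try {
  parseThresholdSketch('1O'); // digit one followed by the letter O
} catch (e) {
  console.log(`ERROR [${e.code}]: ${e.message}`);
}
```

For comparison, `Number.parseInt('3abc', 10)` returns `3`, which is why the sketch checks the full string with a regular expression instead of relying on `parseInt`.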
FINDING_ID_COLLISION
- Class: object-literal `{ code: 'FINDING_ID_COLLISION', findingId, error }` (in the `writeFindings` errors array) AND the `FindingIdCollisionError` class (in the `writeFinding` singleton), both in `packages/findings/derive/write-findings.js`
- Trigger: Two derivation rules generate the same `dfind-<repoSlug>-<lessonSlug>` for the same submission (the id generator does NOT yet discriminate by `rule_id`), and the resulting batch, OR two same-process singleton calls, try to write to the same path. The batch helper `writeFindings` collects collisions into `errors[]`; the singleton `writeFinding` throws.
- Message shape: `intra-batch finding_id collision: '<id>' already claimed by index <N>; refused write at index <M> to avoid silent clobber (D2B-008)` (batch) or `finding_id collision: '<id>' already written in this process; refused to silently clobber (D2B-008 / L3-001 family-seal)` (singleton).
- Hint: rename or skip the colliding finding before re-running `dogfood findings derive --write`. If two rules legitimately share a lesson slug, the structural fix is to differentiate them in `generateFindingId` (rule_id in the slug), which is deferred to a follow-on wave.
- Operator action:
  - Run `dogfood findings derive` (without `--write`) to see which rule pairs are colliding.
  - Either skip the duplicate at the source rule, or extend the id generator to include `rule_id` in the slug.
  - If a re-write is legitimate (e.g. after an intentional disk wipe in a test), call `resetSeenWrites(rootDir)` between the two calls.
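The batch behaviour (collect collisions instead of throwing, first claimant wins) can be sketched with a map from id to claiming index. `writeBatchSketch` is hypothetical and writes nothing to disk; the sample ids are invented.

```javascript
// Sketch of intra-batch collision detection: the first index to claim an id wins.
function writeBatchSketch(findings) {
  const claimedBy = new Map(); // finding_id -> index that first claimed it
  const written = [];
  const errors = [];
  findings.forEach((finding, index) => {
    const id = finding.finding_id;
    if (claimedBy.has(id)) {
      errors.push({
        code: 'FINDING_ID_COLLISION',
        findingId: id,
        error: `intra-batch finding_id collision: '${id}' already claimed by index ${claimedBy.get(id)}; refused write at index ${index} to avoid silent clobber`,
      });
      return;
    }
    claimedBy.set(id, index);
    written.push(id); // a real writer would persist the finding here
  });
  return { written, errors };
}

// Made-up ids: the first and third entries collide.
const batchResult = writeBatchSketch([
  { finding_id: 'dfind-repo-a-lesson-x' },
  { finding_id: 'dfind-repo-a-lesson-y' },
  { finding_id: 'dfind-repo-a-lesson-x' },
]);
console.log(batchResult.written);
console.log(batchResult.errors);
```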
PATTERN_ID_COLLISION
- Class: object-literal `{ code: 'PATTERN_ID_COLLISION', patternId, error }` (in the `writePatterns` errors array) AND the `PatternIdCollisionError` class (in the `writePattern` singleton), both in `packages/findings/synthesis/write-artifacts.js`
- Trigger: Two synthesis rules emit the same `dpat-<slug>` (cluster-key collision) and the resulting batch tries to write both, or two same-process singleton calls collide.
- Message shape: `intra-batch pattern_id collision: '<id>' already claimed by index <N>; refused write at index <M> to avoid silent clobber (D2B-008)` (batch) or singleton variant.
- Hint: same as `FINDING_ID_COLLISION`: fix the duplicating rule or wipe and re-run.
- Operator action: as above, for patterns.
RECOMMENDATION_ID_COLLISION
- Class: object-literal `{ code: 'RECOMMENDATION_ID_COLLISION', recommendationId, error }` (batch) AND `RecommendationIdCollisionError` (singleton), both in `packages/findings/synthesis/write-artifacts.js`
- Trigger: Two recommendation derivations emit the same `drec-<slug>`.
- Message shape: as above, with `recommendation_id` in the message.
- Hint: as above.
- Operator action: as above, for recommendations.
DOCTRINE_ID_COLLISION
- Class: object-literal `{ code: 'DOCTRINE_ID_COLLISION', doctrineId, error }` (batch) AND `DoctrineIdCollisionError` (singleton), both in `packages/findings/synthesis/write-artifacts.js`
- Trigger: Two doctrine derivations emit the same `ddoc-<slug>`.
- Message shape: as above, with `doctrine_id` in the message.
- Hint: as above.
- Operator action: as above, for doctrine.
FINDING_SCHEMA_INVALID
- Class: `FindingValidationError` (`packages/findings/derive/write-findings.js`)
- Trigger: `writeFinding` / `writeFindings` invoked with a finding object that fails AJV validation. Pre-fix, library-path writers had no schema gate (the CLI gated, but programmatic callers did not).
- Message shape: `finding failed schema validation (<finding_id>): <path> <message>; <path> <message>; …`
- Hint: inspect each path against `packages/schemas/src/json/dogfood-finding.schema.json` and fix the upstream emitter. The schema is a contract.
- Carries: `findingId`, `errors[]` (AJV-shaped `{ path, message }`).
- Operator action: same as `RECORD_SCHEMA_INVALID`: fix the emitter, not the schema.
PATTERN_SCHEMA_INVALID
- Class: `PatternValidationError` (`packages/findings/synthesis/write-artifacts.js`)
- Trigger: `writePattern` / `writePatterns` invoked with a malformed pattern. Pre-fix, the synthesis writers had ZERO validation (not even CLI-side), which made this the worst gap of the family.
- Message shape: `pattern failed schema validation (<pattern_id>): <path> <message>; …`
- Hint: inspect against `packages/schemas/src/json/dogfood-pattern.schema.json` and fix the derivation rule.
- Carries: `patternId`, `errors[]`.
- Operator action: fix the derivation rule.
RECOMMENDATION_SCHEMA_INVALID
- Class: `RecommendationValidationError` (`packages/findings/synthesis/write-artifacts.js`)
- Trigger: `writeRecommendation` / `writeRecommendations` invoked with a malformed recommendation.
- Message shape: `recommendation failed schema validation (<recommendation_id>): <path> <message>; …`
- Hint: inspect against the recommendation schema and fix the rule.
- Carries: `recommendationId`, `errors[]`.
- Operator action: fix the derivation rule.
DOCTRINE_SCHEMA_INVALID
- Class: `DoctrineValidationError` (`packages/findings/synthesis/write-artifacts.js`)
- Trigger: `writeDoctrine` / `writeDoctrines` invoked with a malformed doctrine.
- Message shape: `doctrine failed schema validation (<doctrine_id>): <path> <message>; …`
- Hint: inspect against the doctrine schema and fix the rule.
- Carries: `doctrineId`, `errors[]`.
- Operator action: fix the derivation rule.
VALIDATOR_FAULT_SCHEMA
- Class: template-literal `` `VALIDATOR_FAULT_${cls}` `` emitted by the `runValidator('schema', fn)` catch in `packages/verify/index.js`
- Trigger: the `validateSchema` call threw (e.g. AJV crash on a pathological regex, an unexpected reference resolution failure, or an internal assertion). The error string-prefix discriminates `VALIDATOR_FAULT_SCHEMA:` from the submission-bad `schema:` prefix.
- Message shape: appears as a string entry in `verification.rejection_reasons`: `VALIDATOR_FAULT_SCHEMA: <thrown message>`.
- Hint: the verifier itself crashed mid-validation. Escalate to ops; do not page the submitter. Inspect the validator stack and patch the verifier.
- Operator action:
  - Pull the `VALIDATOR_FAULT_SCHEMA:` reasons out of `verification.rejection_reasons` and triage them as a system incident.
  - Re-run with verbose logging on the schema validator to capture the throw site.
  - Patch the validator; the submission is a useful repro fixture, NOT the bug.
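The prefix discrimination can be sketched as a wrapper that converts a validator crash into a `VALIDATOR_FAULT_<CLS>:` reason, plus a splitter that separates verifier-side faults from submission-side rejections. Both function names, the upper-casing of `cls`, and the assumption that a validator returns an array of reason strings are hypothetical; the real logic is in `packages/verify/index.js`.

```javascript
// Wrap a validator so a throw becomes a prefixed rejection reason.
// Assumes fn returns an array of reason strings when it does not throw.
function runValidatorSketch(cls, fn) {
  try {
    return fn();
  } catch (e) {
    return [`VALIDATOR_FAULT_${cls.toUpperCase()}: ${e && e.message ? e.message : String(e)}`];
  }
}

// Separate verifier crashes from ordinary submission rejections.
function splitReasons(reasons) {
  const isFault = (reason) => reason.startsWith('VALIDATOR_FAULT_');
  return {
    faults: reasons.filter(isFault),
    submissionBad: reasons.filter((reason) => !isFault(reason)),
  };
}

// Made-up validators: one rejects the submission, one crashes.
const rejectionReasons = [
  ...runValidatorSketch('schema', () => ['schema: /run_id must be string']),
  ...runValidatorSketch('policy', () => { throw new Error('unexpected policy shape'); }),
];
const triage = splitReasons(rejectionReasons);
console.log(triage.faults);        // route to ops as a system incident
console.log(triage.submissionBad); // route to the submitter
```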
VALIDATOR_FAULT_POLICY
- Class: template-literal `` `VALIDATOR_FAULT_${cls}` `` emitted by the `runValidator('policy', fn)` catch in `packages/verify/index.js`
- Trigger: the policy-validator call threw (e.g. deep-merge corrupted by a prototype-pollution probe, an unexpected policy shape from `loadRepoPolicy`).
- Message shape: `VALIDATOR_FAULT_POLICY: <thrown message>` in `rejection_reasons[]`.
- Hint: as above. This is a verifier-side incident, not a bad submission.
- Operator action: triage as a system incident, patch the policy validator.
VALIDATOR_FAULT_STEPS
- Class: template-literal `` `VALIDATOR_FAULT_${cls}` `` emitted by the `runValidator('steps', fn)` catch in `packages/verify/index.js`
- Trigger: the step-contract checker threw (e.g. an evidence-shape walk hit an unexpected nesting, a gate-accumulation arithmetic edge).
- Message shape: `VALIDATOR_FAULT_STEPS: <thrown message>` in `rejection_reasons[]`.
- Hint: as above. This is a verifier-side incident.
- Operator action: triage as a system incident, patch the steps validator.
STATE_MACHINE_<KIND> — BLOCKED, TERMINAL, INVALID
- Class: `StateMachineRejectionError` (`packages/dogfood-swarm/lib/errors.js`)
- Trigger: `transitionAgent()` rejected a state-machine transition. The `kind` field discriminates why:
  - `STATE_MACHINE_BLOCKED`: the transition is legal in the abstract but blocked by a guard (e.g. dependencies not met, override required). Operator's problem.
  - `STATE_MACHINE_TERMINAL`: the agent is in a terminal state (`complete`, `rejected`, etc.), so no transitions are allowed. Caller bug: something tried to advance an already-finished agent.
  - `STATE_MACHINE_INVALID`: the transition is missing from the `TRANSITIONS` table. Legitimate disallowed transition (e.g. `idle → complete` skipping `running`).
- Message shape: `Illegal transition <from> → <to>: <reason>` with explicit kind in `e.code`.
- Hint: `e.hint` is set per-kind by the throwing site (e.g. "use `swarm revalidate` to lawfully recover from blocked states" for BLOCKED, "this agent is already complete; check why the caller tried to re-advance it" for TERMINAL).
- Carries: `kind`, `from`, `to`, `agentRunId`, `allowedTransitions[]` (legal `to` set from the current `from`).
- Operator action:
  - BLOCKED: look at the `Next:` hint, which usually points at an override flag or a missing prerequisite.
  - TERMINAL: the agent is done; the bug is upstream. Inspect the caller for a re-advance loop.
  - INVALID: check `allowedTransitions[]` for what the state machine will accept from this `from`. Either reroute the call or, if the transition should be legal, file a finding to add the edge to `TRANSITIONS`.
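The three kinds can be told apart with a table lookup plus an optional guard, as in the sketch below. The four-state table, the guard callback, and `checkTransitionSketch` are assumptions made for illustration; the real `TRANSITIONS` table and `transitionAgent()` are not reproduced here.

```javascript
// Hypothetical, deliberately tiny transition table.
const TRANSITIONS_SKETCH = {
  idle: ['running'],
  running: ['complete', 'rejected'],
  complete: [],
  rejected: [],
};
const TERMINAL_SKETCH = new Set(['complete', 'rejected']);

// guard(from, to) returns a reason string when the edge is blocked, else null.
function checkTransitionSketch(from, to, guard = () => null) {
  const allowedTransitions = TRANSITIONS_SKETCH[from] ?? [];
  let kind = null;
  let reason = '';
  if (TERMINAL_SKETCH.has(from)) {
    kind = 'TERMINAL';
    reason = `${from} is a terminal state`;
  } else if (!allowedTransitions.includes(to)) {
    kind = 'INVALID';
    reason = 'edge is not in the transition table';
  } else {
    const blockedReason = guard(from, to);
    if (blockedReason) {
      kind = 'BLOCKED';
      reason = blockedReason;
    }
  }
  if (!kind) return to;
  const err = new Error(`Illegal transition ${from} → ${to}: ${reason}`);
  Object.assign(err, { code: `STATE_MACHINE_${kind}`, kind, from, to, allowedTransitions });
  throw err;
}

try {
  checkTransitionSketch('idle', 'complete'); // skips running
} catch (e) {
  console.log(e.code, e.allowedTransitions);
}
```

Checking the terminal case before the table lookup keeps a re-advance of a finished agent from being reported as a merely missing edge.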
INGEST_FAILED
- Class: structured stderr envelope (not a thrown typed error). `console.error('ERROR [INGEST_FAILED]: …')` is emitted at the swarm CLI ingest seam (`packages/dogfood-swarm/cli.js`) and from `packages/dogfood-swarm/persist-results.js`. Mirrors the documented `ERROR [<CODE>]:` shape even though it is printed rather than rendered through `renderTopLevelError`.
- Trigger: the `--ingest` path attempted to record the run's own dogfood submission and the downstream ingest either returned `ingested !== true` (CLI seam, with the verifier's `reason`) or exited non-zero (`persist-results.js` seam). Common underlying cause: the swarm-emitted submission failed schema validation in `packages/ingest/run.js`.
- Message shape:
  - CLI seam: `ERROR [INGEST_FAILED]: dogfood ingest did not complete — <reason>`
  - persist-results seam: `ERROR [INGEST_FAILED]: dogfood ingest exited non-zero`
  - Both follow the failure line with `Submission: <path>` and a copy-pasteable `Reproduce: node "<repo>/packages/ingest/run.js" --provenance=stub --file "<submission>"` line; the persist-results seam also prints `Exit code: <n>` when available.
- Operator action:
  - Run the printed `Reproduce:` command to replay the ingest in isolation with full output.
  - The most common cause is a schema-invalid submission. Inspect the AJV failure against `packages/schemas/src/json/dogfood-record.schema.json` and fix the swarm's submission emitter, not the schema.
  - Re-run `swarm verify --ingest` once the emitter is corrected. The human-readable summary still prints `Ingested: NO` to stdout so the failure is visible in both streams.
Cross-references
- Hard Gate B (Errors): structured shape (code/message/hint), exit codes for CLI, no raw stacks. See README threat model.
- The state machine these errors come out of: State Machines.
- Where rejected records land when ingest throws `RECORD_SCHEMA_INVALID` or `DUPLICATE_RUN_ID`: `records/_rejected/` (Beginner's Guide → Investigating a failure).