Intelligence Layer
The intelligence layer turns dogfood evidence into reusable lessons, patterns, recommendations, and doctrine that future projects can inherit.
This page documents the finding-review state machine (candidate → reviewed → accepted → invalidated). It is one of the four distinct status vocabularies in testing-os. For the record-classification vocabulary (accepted / rejected / portfolio buckets) see Architecture and Operating Guide. For the wave-finding classification (new / recurring / fixed / unverified) emitted by dogfood-swarm runs, see the State Machines reference. For the agent_run lifecycle (the fourth vocabulary — pending / dispatched / running / complete / failed / timed_out / invalid_output / ownership_violation / aborted_for_rewind), see the same reference. The four vocabularies share words but operate on different objects — read the glossary if you have not already.
The Learning Loop
record → finding → reviewed memory → pattern / doctrine → future guidance

Every step is evidence-bound, deterministic, and auditable. No LLM is involved in the extraction or clustering loop.
Four New Artifact Types
The intelligence layer adds four contracts to testing-os’ original three (record, scenario, policy):
Finding
An evidence-bound lesson extracted from one or more dogfood runs.
- Status lifecycle: candidate → reviewed → accepted → (invalidated)
- Must reference at least one source record and one evidence item
- Classification: issue_kind, root_cause_kind, remediation_kind, transfer_scope
- Stored as YAML in `findings/<org>/<repo>/`
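The constraints in this list can be sketched as a small check. This is a minimal illustration, not the real schema validator: the field names `source_record_ids` and `evidence`, the function `checkFinding`, and the choice to treat all four classification fields as required are assumptions made for the example.

```javascript
// Sketch only: `source_record_ids` and `evidence` are assumed field names,
// and requiring every classification field is an assumption.
const CLASSIFICATION_FIELDS = [
  "issue_kind",
  "root_cause_kind",
  "remediation_kind",
  "transfer_scope",
];

function checkFinding(finding) {
  const problems = [];
  if (!Array.isArray(finding.source_record_ids) || finding.source_record_ids.length === 0) {
    problems.push("needs at least one source record");
  }
  if (!Array.isArray(finding.evidence) || finding.evidence.length === 0) {
    problems.push("needs at least one evidence item");
  }
  for (const field of CLASSIFICATION_FIELDS) {
    if (typeof finding[field] !== "string" || finding[field] === "") {
      problems.push(`missing classification field: ${field}`);
    }
  }
  return problems; // an empty array means the finding passes this sketch
}
```

Returning a list of problems rather than throwing on the first one keeps the check auditable: a reviewer sees every gap at once.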
Pattern
A repeated lesson cluster backed by 2+ accepted findings.
- Formed by dimension-based clustering (issue_kind + root_cause_kind)
- False recurrence detection prevents duplicate-incident inflation
- Strength levels: emerging, strong, portfolio_stable
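Dimension-based clustering can be pictured roughly as follows. This sketch assumes findings are plain objects with `id`, `status`, `issue_kind`, and `root_cause_kind` fields; the function name `clusterFindings` is hypothetical. False recurrence detection and strength assignment are deliberately left out, since their rules are not described here.

```javascript
// Sketch only: object shapes and the function name are assumptions.
function clusterFindings(findings) {
  const clusters = new Map();
  for (const f of findings) {
    if (f.status !== "accepted") continue; // only accepted findings back a pattern
    const key = `${f.issue_kind}::${f.root_cause_kind}`;
    if (!clusters.has(key)) clusters.set(key, []);
    clusters.get(key).push(f.id);
  }
  // A pattern needs two or more accepted findings behind it.
  return [...clusters.entries()]
    .filter(([, ids]) => ids.length >= 2)
    .map(([key, ids]) => ({ key, finding_ids: ids }));
}
```

Because a `Map` preserves insertion order, the same input list always yields the same clusters in the same order, which matches the deterministic intent of the layer.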
Recommendation
Actionable guidance derived from accepted patterns.
- Kinds: starter_check, starter_scenario, policy_seed, evidence_expectation, verification_rule, review_prompt
- Each recommendation includes a specific action with type, target, and details
- Confidence tracks pattern strength
Doctrine
Hardened portfolio rules earned from repeated patterns.
- Only promoted from strong or portfolio_stable patterns
- org_wide scope requires 2+ supporting patterns
- Statement reads as a directive, not a suggestion
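The promotion rules above can be sketched as a single predicate. The function name `canPromoteToDoctrine` and the `strength` field are hypothetical, and the sketch assumes that only strong or portfolio_stable patterns count toward the org_wide minimum, which this page does not spell out.

```javascript
// Sketch only: names are assumptions; see the note on org_wide counting above.
const PROMOTABLE_STRENGTHS = new Set(["strong", "portfolio_stable"]);

function canPromoteToDoctrine(scope, supportingPatterns) {
  const eligible = supportingPatterns.filter((p) => PROMOTABLE_STRENGTHS.has(p.strength));
  if (eligible.length === 0) return false; // emerging patterns never promote
  if (scope === "org_wide") return eligible.length >= 2; // org-wide needs 2+ patterns
  return true;
}
```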
Derivation Engine
Eight deterministic rules extract candidate findings from verified dogfood records:
| Rule | Fires on | Issue |
|---|---|---|
| Surface misclassification | Invalid product_surface enum | surface_misclassification |
| Evidence policy mismatch | Evidence requirement rejection | evidence_overconstraint |
| Verdict downgrade | Proposed pass downgraded | schema_mismatch / policy_mismatch |
| Scenario step failure | Step-level failures | build_output_mismatch / entrypoint_truth |
| Blocked scenario | Blocked verdict with reason | verification_gap |
| Execution mode gap | Mixed/human missing attestation | execution_mode_mismatch |
| Schema rejection | Non-surface schema failures | schema_mismatch |
| Policy rejection | Non-evidence policy failures | policy_mismatch |
Every emitted finding includes the rule ID, rationale, and exact evidence references.
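To make the shape of a rule concrete, here is a rough sketch of what the "Blocked scenario" rule might look like. The record shape (`scenarios`, `verdict`, `blocked_reason`, `evidence_refs`), the rule id string, and the function name are all hypothetical; only the trigger (a blocked verdict with a reason) and the emitted issue kind come from the table above.

```javascript
// Sketch only: record fields and the rule id are illustrative assumptions.
function deriveBlockedScenario(record) {
  const findings = [];
  for (const s of record.scenarios ?? []) {
    if (s.verdict !== "blocked" || !s.blocked_reason) continue;
    findings.push({
      status: "candidate", // derived findings start unreviewed
      issue_kind: "verification_gap",
      rule_id: "blocked_scenario",
      rationale: `Scenario ${s.id} was blocked: ${s.blocked_reason}`,
      source_record_ids: [record.id],
      evidence_refs: [...(s.evidence_refs ?? [])],
    });
  }
  return findings;
}
```

The function reads nothing but its input (no clock, no randomness), so re-running it on the same record yields identical output.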
Review Workflow
Findings move through a lawful state machine:
- candidate — machine- or human-created, not yet reviewed
- reviewed — looked at by a human, may need refinement
- accepted — approved as trustworthy reusable learning
- rejected — declined with explicit reason
Available actions: accept, reject, review, edit, merge, reopen, invalidate
All actions are logged in an append-only event log with actor, timestamp, from/to status, field diffs, and reasons.
Two or more findings describing the same lesson can be merged into one canonical finding. The merge preserves all evidence, source record IDs, and lineage. Source findings are marked superseded.
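A lawful state machine with an append-only log can be sketched as a transition table plus a guard. The target statuses, the `reopen` sources (rejected or accepted), and the `invalidate` source (accepted) follow this page; the `from` sets for `review`, `accept`, and `reject` are assumptions, as are the function name and the log entry shape. The real law lives in review/transitions.js and may differ.

```javascript
// Sketch only: several `from` sets are assumed, see the note above.
const TRANSITIONS = {
  review: { from: ["candidate"], to: "reviewed" },
  accept: { from: ["candidate", "reviewed"], to: "accepted" },
  reject: { from: ["candidate", "reviewed"], to: "rejected" },
  reopen: { from: ["accepted", "rejected"], to: "reviewed" },
  invalidate: { from: ["accepted"], to: "invalidated" },
};

function applyAction(finding, action, { actor, reason, now = new Date().toISOString() }, log) {
  const rule = TRANSITIONS[action];
  if (!rule || !rule.from.includes(finding.status)) {
    throw new Error(`unlawful transition: ${action} from ${finding.status}`);
  }
  // Entries are frozen and only ever pushed, never edited or removed.
  log.push(Object.freeze({
    finding_id: finding.id,
    action,
    actor,
    timestamp: now,
    from: finding.status,
    to: rule.to,
    reason,
  }));
  return { ...finding, status: rule.to };
}
```

Refusing before writing to the log means a rejected attempt leaves no partial trace in this sketch.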
Invalidation
Accepted findings can be invalidated when source truth changes. Invalidated findings are excluded from advice queries but retained for historical reference.
Promoting Artifacts (closing the loop)
Synthesis writes patterns, recommendations, and doctrine with status: candidate. The advice surface (queryPatterns / queryRecommendations / queryDoctrine) returns only accepted artifacts, so until a candidate is reviewed and accepted, nothing the intelligence layer derives reaches a future project. The artifact review verbs are what close that loop.

    # Promote a derived pattern into the advise surface (accept = the loop closes)
    node packages/findings/cli.js patterns accept <pattern_id> --actor <name> --reason "..."

    # Same for recommendations and doctrine
    node packages/findings/cli.js recommendations accept <recommendation_id> --actor <name>
    node packages/findings/cli.js doctrine accept <doctrine_id> --actor <name>

    # Reject (reason required) or, for patterns, invalidate an accepted one
    node packages/findings/cli.js patterns reject <pattern_id> --actor <name> --reason "not a real recurrence"
    node packages/findings/cli.js patterns invalidate <pattern_id> --actor <name> --reason "source changed"

    # What is awaiting review?
    node packages/findings/cli.js patterns queue

The artifact review law reuses the finding status law (review/transitions.js). It is intentionally narrower than the finding lifecycle because the artifact schemas are narrower:
- pattern status ∈ candidate / accepted / rejected / invalidated
- recommendation and doctrine status ∈ candidate / accepted / rejected
Because no artifact schema permits the intermediate reviewed status, the review and reopen verbs (which target reviewed) are refused for artifacts rather than writing a schema-invalid file. invalidate is supported for patterns only — recommendations and doctrine have no invalidated status, so use reject to retire them. Every accepted/rejected/invalidated decision is written through the synthesis writers, which re-validate the artifact against its JSON Schema, and is logged in the same append-only event log (carrying artifact_id + artifact_kind).
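One way to see why `review`, `reopen`, and (for two of the three kinds) `invalidate` are refused: a verb is only usable if the status it writes exists in that artifact's schema. The sketch below encodes the status sets listed above; the names `ARTIFACT_STATUSES`, `VERB_TARGET`, and `verbAllowed` are hypothetical, and the check on the artifact's current status is omitted.

```javascript
// Sketch only: names are assumptions; status sets mirror the list above.
const ARTIFACT_STATUSES = {
  pattern: ["candidate", "accepted", "rejected", "invalidated"],
  recommendation: ["candidate", "accepted", "rejected"],
  doctrine: ["candidate", "accepted", "rejected"],
};

// The status each review verb would write.
const VERB_TARGET = {
  accept: "accepted",
  reject: "rejected",
  invalidate: "invalidated",
  review: "reviewed",
  reopen: "reviewed",
};

function verbAllowed(kind, verb) {
  const target = VERB_TARGET[verb];
  // Refuse any verb whose target status the artifact schema cannot hold.
  return target !== undefined && (ARTIFACT_STATUSES[kind] ?? []).includes(target);
}
```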
Re-derivation never clobbers a decision
Re-running `<type> derive --write` produces the same deterministic ids. A freshly-derived candidate that collides with an artifact you already promoted (accepted / rejected / invalidated) is preserved, not overwritten: the operator’s decision is load-bearing. Collisions are reported as Preserved (operator-promoted, not overwritten). This mirrors the findings dedupe (derive/dedupe.js) and lives in synthesis/dedupe-artifacts.js.
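The preservation rule can be sketched as a partition of freshly derived artifacts. This is an illustration under assumptions, not the contents of synthesis/dedupe-artifacts.js: the function name `partitionDerived` and the `Map`-by-id input are invented for the example.

```javascript
// Sketch only: function name and input shapes are assumptions.
const DECIDED = new Set(["accepted", "rejected", "invalidated"]);

function partitionDerived(existingById, derived) {
  const toWrite = [];
  const preserved = [];
  for (const artifact of derived) {
    const current = existingById.get(artifact.id);
    if (current && DECIDED.has(current.status)) {
      preserved.push(artifact.id); // operator decision wins, file is left alone
    } else {
      toWrite.push(artifact); // new id, or still an undecided candidate
    }
  }
  return { toWrite, preserved };
}
```

The rule only works because derivation ids are deterministic: a re-derived artifact lands on the same id as the one the operator already decided on.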
Applying Recommendations Back
An accepted recommendation whose action is a structured add_scenario / add_check can be applied directly into a named repo policy:

    # Preview (default): renders the change, writes nothing
    node packages/findings/cli.js recommendations apply <recommendation_id> --policy <org/repo>

    # Apply the structured intent + record provenance
    node packages/findings/cli.js recommendations apply <recommendation_id> --write --policy <org/repo> --actor <name>

This is honest partial automation:
- Only an accepted recommendation is applicable; others refuse with a structured `{ code, message, hint }` error.
- `--write` applies only the structured `target` id, adding it to the policy’s `surfaces.<surface>.required_scenarios`, and records `recommendation_id` + `details` as provenance.
- The free-text `action.details` is never injected as policy logic.
- Free-text-only action types (`set_policy`, `set_evidence`, `set_verification`, `add_review_step`) and ambiguous targets (no named policy, or a recommendation spanning multiple surfaces) refuse `--write` with a hint to apply manually. Never a fake auto-apply.
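The apply rules above can be sketched for the `add_scenario` case. Everything named here is hypothetical: the function `applyAddScenario`, the error codes, the `provenance` array on the policy, and the in-memory policy shape. The `add_check` case is left out because this page does not say which policy field it writes to.

```javascript
// Sketch only: error codes, `provenance`, and object shapes are assumptions.
function applyAddScenario(rec, policy, surface) {
  if (rec.status !== "accepted") {
    return {
      ok: false,
      error: { code: "not_accepted", message: `recommendation is ${rec.status}`, hint: "accept it first" },
    };
  }
  if (rec.action.type !== "add_scenario") {
    return {
      ok: false,
      error: { code: "not_modeled", message: `${rec.action.type} is outside this sketch`, hint: "apply manually" },
    };
  }
  const current = policy.surfaces?.[surface]?.required_scenarios ?? [];
  // Only the structured target id is added; action.details never becomes policy logic.
  const required = current.includes(rec.action.target) ? current : [...current, rec.action.target];
  return {
    ok: true,
    policy: {
      ...policy,
      surfaces: {
        ...policy.surfaces,
        [surface]: { ...policy.surfaces?.[surface], required_scenarios: required },
      },
      provenance: [
        ...(policy.provenance ?? []),
        { recommendation_id: rec.id, details: rec.action.details },
      ],
    },
  };
}
```

The function returns a new policy object instead of mutating its input, which makes a preview (render the result, write nothing) and a write (persist the result) the same computation.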
Advice Surface
The adoption layer answers future-project questions directly:

    # What should a new MCP server repo inherit?
    node packages/findings/cli.js advise --surface mcp-server

    # What about a desktop app with mixed-mode dogfood?
    node packages/findings/cli.js advise --surface desktop --execution-mode mixed

    # Export all accepted learning for repo-knowledge
    node packages/findings/cli.js sync-export --json

Advice bundles include:
- Starter checks and scenarios
- Evidence expectations
- Likely failure classes (top 3)
- Relevant doctrine
- Supporting pattern and finding IDs
Results are ranked (stronger and more specific first) and capped (max 5 recommendations, 5 doctrine, 3 failure classes).
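Ranking and capping can be sketched as a sort followed by a slice. The caps come from the sentence above; the numeric `specificity` field, the `strength` field on each item, and the ordering emerging < strong < portfolio_stable are assumptions made for the example.

```javascript
// Sketch only: `specificity` and the strength ordering are assumptions.
const CAPS = { recommendations: 5, doctrine: 5, failure_classes: 3 };
const STRENGTH_RANK = { emerging: 1, strong: 2, portfolio_stable: 3 };

function rankAndCap(items, cap) {
  return [...items] // copy so the caller's array is not reordered
    .sort(
      (a, b) =>
        (STRENGTH_RANK[b.strength] ?? 0) - (STRENGTH_RANK[a.strength] ?? 0) ||
        b.specificity - a.specificity
    )
    .slice(0, cap);
}
```

For example, `rankAndCap(recommendations, CAPS.recommendations)` would keep at most five entries, strongest first, with specificity breaking ties.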
CLI Reference
Finding management
- `list`: list all findings with filters
- `show <id>`: show finding detail
- `validate`: validate all findings against schema
- `derive --all --dry-run`: derive candidates from records
- `explain <id>`: show derivation provenance
Review
- `accept <id> --actor <name> --reason "..."`: promote to accepted
- `reject <id> --actor <name> --reason "..."`: reject with reason
- `edit <id> --actor <name> --set field=value`: edit fields
- `merge <id1> <id2> --into <canonical> --actor <name> --reason "..."`: merge findings
- `invalidate <id> --actor <name> --reason "..."`: invalidate accepted finding
- `reopen <id> --actor <name>`: reopen rejected/accepted finding
- `history <id>`: show review audit trail
- `queue`: show pending review work
Synthesis
- `patterns derive [--write]`: derive patterns from accepted findings
- `recommendations derive [--write]`: derive from accepted patterns
- `doctrine derive [--write]`: derive from strong patterns
Artifact review (close the loop)
- `patterns accept <id> --actor <name> [--reason "..."]`: promote a candidate pattern into the advise surface
- `patterns reject <id> --actor <name> --reason "..."`: reject a pattern (reason required)
- `patterns invalidate <id> --actor <name> --reason "..."`: invalidate an accepted pattern (patterns only)
- `patterns queue`: patterns awaiting review
- `recommendations accept|reject <id> --actor <name>`: review a recommendation
- `recommendations apply <id> [--dry-run | --write] [--policy <org/repo>] [--actor <name>]`: apply an accepted recommendation into a policy
- `recommendations queue`
- `doctrine accept|reject <id> --actor <name>`: review a doctrine
- `doctrine queue`
Adoption
- `advise --surface <surface> [--execution-mode <mode>]`: get advice bundle
- `sync-export [--json]`: export for repo-knowledge
Integration
| System | Role |
|---|---|
| testing-os | Source of truth — owns all learning artifacts |
| repo-knowledge | Consumer — syncs accepted artifacts via sync-export |
| role-os | Consumer — pulls advice into bootstrap/review contexts |
| shipcheck | Enforcement — uses dogfood status, not intelligence layer directly |
Test Coverage
246 finding tests at v1.1.7 (covering contract spine, derivation, review, synthesis, and adoption modules). The precise per-module breakdown drifts with each release and is intentionally not kept in lockstep with this slow-moving handbook page. See docs/m5-validation-2026-04-29.md for the authoritative current matrix.