Getting Started
This guide takes you from cloning the repo to invoking your first audit — about 15 minutes if you have Node 20+ and a Claude account.
Requirements
- Node.js ≥ 20 for the local verify tooling (schema validation, link check, shipcheck audit).
- Git to clone the repo.
- Claude (or a compatible AI runner) with browser-navigation MCP tools to actually invoke an audit. The audits themselves are markdown rubrics — they don’t execute natively.
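If you are unsure whether your Node.js meets the requirement above, a small check like the following can tell you. This is a minimal sketch; `meetsMinimumNode` is a hypothetical helper written for this guide, not something the repo provides.

```javascript
// Minimal sketch: compare a Node.js version string against a minimum major version.
// `meetsMinimumNode` is a hypothetical helper, not part of the repo.
function meetsMinimumNode(version, minimumMajor = 20) {
  // process.versions.node looks like "20.11.1"; only the major component matters here.
  const major = Number.parseInt(version.split(".")[0], 10);
  return Number.isInteger(major) && major >= minimumMajor;
}

const current = process.versions.node;
console.log(
  meetsMinimumNode(current)
    ? `Node.js ${current} satisfies the >= 20 requirement.`
    : `Node.js ${current} is older than 20; upgrade before running the verify tooling.`
);
```

Running `node --version` in a terminal gives the same information without any script.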
Clone and install
    git clone https://github.com/dogfood-lab/interface-audits.git
    cd interface-audits
    npm install

npm install brings in three dev dependencies (ajv, ajv-formats, glob) used by the verify scripts. There are no production dependencies: the audits themselves are plain markdown.
Run the verify tooling
    npm run verify

This runs three checks in sequence:
- verify:schemas checks that every *-scorecard.json under audits/*/evidence/<run-id>/ validates against shared/schemas/scorecard.base.schema.json, and that every finding inside validates against shared/schemas/finding.base.schema.json.
- verify:links checks that every relative markdown link in every *.md file resolves to an existing file. It skips external links, anchors, and links inside fenced or inline code blocks.
- shipcheck audit enforces hard gates A–D of the ship gate, which must pass before any release is cut.
On a clean repo, all three exit 0.
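To make the link check concrete, here is a rough sketch of how such a check could work. It is not the repo's actual script: `relativeLinks` and `brokenLinks` are hypothetical names, and the sketch assumes inline links of the form [text](target) only, ignoring reference-style links.

```javascript
const fs = require("node:fs");
const path = require("node:path");

// Sketch: collect relative link targets from markdown text, skipping external
// links, pure anchors, and anything inside fenced or inline code.
// ASSUMPTION: only inline [text](target) links are considered.
function relativeLinks(markdown) {
  const withoutCode = markdown
    .replace(/`{3}[\s\S]*?`{3}/g, "") // drop fenced code blocks
    .replace(/`[^`\n]*`/g, "");       // drop inline code spans
  const links = [];
  for (const match of withoutCode.matchAll(/\[[^\]]*\]\(([^)\s]+)[^)]*\)/g)) {
    const target = match[1];
    const isExternal = /^[a-z][a-z0-9+.-]*:/i.test(target) || target.startsWith("//");
    const isAnchor = target.startsWith("#");
    if (!isExternal && !isAnchor) {
      links.push(target.split("#")[0]); // keep the file part, drop any fragment
    }
  }
  return links;
}

// Sketch: return the relative targets in `file` that do not exist on disk.
function brokenLinks(file) {
  const dir = path.dirname(file);
  return relativeLinks(fs.readFileSync(file, "utf8")).filter(
    (target) => !fs.existsSync(path.resolve(dir, target))
  );
}
```

Under these assumptions, an empty array from `brokenLinks` for every *.md file corresponds to the passing case.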
Invoke your first audit
The first audit is Cognitive Load. Tell Claude:

    Run cognitive-load audit on <target-url-or-surface>
Pick a target — a docs site, a dashboard, an internal tool. Claude will:
- Walk the 8 sections of the rubric (audits/cognitive-load/RUBRIC.md).
- Probe each section against your target’s live state (via browser-navigation MCP tools).
- Produce three outputs under audits/cognitive-load/evidence/<run-id>/:
  - cognitive-load-findings.md: the full finding report
  - cognitive-load-scorecard.json: per-section pass/warn/fail + summary
  - remediation-priority-list.md: findings ordered by severity × leverage
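As a rough illustration of what "per-section pass/warn/fail" can look like once loaded, the sketch below tallies statuses from a scorecard object. The field names (`sections`, `id`, `status`) and the example data are assumptions made for this sketch; the authoritative shape is whatever shared/schemas/scorecard.base.schema.json defines.

```javascript
// Sketch: tally per-section statuses from a scorecard object.
// ASSUMPTION: the scorecard has a `sections` array whose entries carry a
// `status` of "pass", "warn", or "fail". The real shape is defined by
// shared/schemas/scorecard.base.schema.json and may differ.
function tallyStatuses(scorecard) {
  const counts = { pass: 0, warn: 0, fail: 0 };
  for (const section of scorecard.sections ?? []) {
    // Ignore any status value outside the three expected ones.
    if (Object.hasOwn(counts, section.status)) counts[section.status] += 1;
  }
  return counts;
}

// Hypothetical example data, not taken from a real run.
const example = {
  sections: [
    { id: "section-1", status: "pass" },
    { id: "section-2", status: "warn" },
    { id: "section-3", status: "fail" },
    { id: "section-4", status: "pass" },
  ],
};
console.log(tallyStatuses(example)); // { pass: 2, warn: 1, fail: 1 }
```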
See Usage for what to do with those outputs.
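The "severity × leverage" ordering of the remediation list can be pictured as a sort on the product of two scores. The sketch below assumes each finding carries numeric `severity` and `leverage` fields; that is an illustration only, and the repo's finding format may encode them differently (for example as labels).

```javascript
// Sketch: order findings by severity × leverage, highest product first.
// ASSUMPTION: numeric `severity` and `leverage` fields on each finding;
// these names and scales are hypothetical.
function prioritize(findings) {
  // Copy before sorting so the caller's array is left untouched.
  return [...findings].sort(
    (a, b) => b.severity * b.leverage - a.severity * a.leverage
  );
}

// Hypothetical example data, not taken from a real run.
const sample = [
  { id: "F-1", severity: 2, leverage: 1 },
  { id: "F-2", severity: 3, leverage: 3 },
  { id: "F-3", severity: 1, leverage: 3 },
];
console.log(prioritize(sample).map((f) => f.id)); // [ 'F-2', 'F-3', 'F-1' ]
```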
Read past evidence
The repo ships with four completed audit runs you can browse:
- evidence/pt0/: Pressure Test 0 on claude.ai (produced the v0.1 rubric patches)
- evidence/pt1-github-narrow/: Pressure Test 1 on GitHub’s responsive layout (produced the v0.2 Section 5 taxonomy)
- evidence/pt2-outlook-doc-fallback/: Pressure Test 2 on Outlook’s Simplified Ribbon, run as documentation-fallback. First draft overclaimed Removed findings; honest reclassification moved them to Hidden. The calibration record is in the auditor notes.
- evidence/dogfood-1-research-os-handbook/: Dogfood Run 1 on the research-os handbook. Healthy result: 8 findings + 4 positive observations, no rubric churn.
Each run is three files. Start with *-findings.md.
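If you would rather list the starting points programmatically, a sketch like the following walks an evidence folder and picks out the *-findings.md file in each run directory. `findingsByRun` is a hypothetical helper, and `evidenceDir` is whichever evidence folder you are browsing; its exact location in the repo is not assumed here.

```javascript
const fs = require("node:fs");
const path = require("node:path");

// Sketch: map each run directory under `evidenceDir` to the path of its
// "*-findings.md" file, or null when the run has no such file.
// `findingsByRun` is a hypothetical helper, not part of the repo.
function findingsByRun(evidenceDir) {
  const result = {};
  for (const entry of fs.readdirSync(evidenceDir, { withFileTypes: true })) {
    if (!entry.isDirectory()) continue; // only run directories are of interest
    const runDir = path.join(evidenceDir, entry.name);
    const report = fs
      .readdirSync(runDir)
      .find((name) => name.endsWith("-findings.md"));
    result[entry.name] = report ? path.join(runDir, report) : null;
  }
  return result;
}
```

Calling `findingsByRun` with the path to an evidence folder returns an object keyed by run directory name.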
What’s next
- Usage: invoking audits in detail, reading scorecards, interpreting remediation lists
- Reference: the rubric format, finding format, full load-displaced-to enum
- Architecture: how audits are structured, the lifecycle, the four-thing rule