# Reduce the Burden. Improve the Experience.
Proof-backed audits for human-facing product surfaces. Generic scanners catch WCAG violations. These audits catch interfaces that pass scanners but still make users hunt — load displacement, hidden complexity, AI trust burden, state-shift failure.
## Invoke

```
Run cognitive-load audit on https://example.com
```
## Outputs

```
audits/cognitive-load/evidence/<run-id>/
├── cognitive-load-findings.md
├── cognitive-load-scorecard.json
└── remediation-priority-list.md
```
## Verify

```sh
npm run verify   # schemas + links + shipcheck
```
## What makes these different

Catches the failures generic scanners miss.
### Names where the load went
Every finding declares one or more values from an 11-item enum (search, memory, trust, source recovery, navigation, configuration, visual decoding, time, recovery/undo, feature loss, verification). No vague "this is hard to use" findings.
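The enum discipline amounts to a membership check. A minimal sketch, assuming hyphenated value names; the canonical strings live in the audit's schema, not here:

```javascript
// Hypothetical load-destination enum. The canonical 11 values are
// defined in shared/schemas/; these strings are illustrative.
const LOAD_TYPES = new Set([
  'search', 'memory', 'trust', 'source-recovery', 'navigation',
  'configuration', 'visual-decoding', 'time', 'recovery-undo',
  'feature-loss', 'verification',
]);

// A finding must name at least one destination, and every one must
// come from the enum. No free-text "this is hard to use" escape hatch.
function validLoadTypes(finding) {
  return Array.isArray(finding.load_types) &&
    finding.load_types.length > 0 &&
    finding.load_types.every((t) => LOAD_TYPES.has(t));
}
```

So `{ load_types: ['search', 'trust'] }` passes, while an empty list or an off-enum value fails validation outright.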
### Schema-validated outputs
Findings and scorecards conform to JSON Schema (shared/schemas/). CI rejects malformed evidence. The audit walks its own talk.
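In CI this is a hard gate: a malformed scorecard fails the build. A minimal stand-in for that gate, checking required keys only; the real pipeline presumably runs a full JSON Schema validator against shared/schemas/, and the field names here are assumptions:

```javascript
// Stand-in for the CI gate: reject a scorecard missing required keys.
// Field names are hypothetical; the real contract is the JSON Schema
// in shared/schemas/.
const REQUIRED = ['audit', 'run_id', 'findings', 'overall_score'];

function checkScorecard(doc) {
  const missing = REQUIRED.filter((k) => !(k in doc));
  return { ok: missing.length === 0, missing };
}
```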
### No evidence, no official audit
Each audit must ship four things: Rubric, Skill, Schema, and at least one Evidence run. Lifecycle states (Draft → Pressure-tested → Frozen → Dogfooded → Revised) are declared in each audit's README.
### Honest evidence states
Findings are marked Observed, Inferred, or Open question. Findings without evidence are downgraded to Open questions, not laundered into confirmed issues. The Inferred fallback is the brake against manufacturing findings.
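The downgrade rule is mechanical. A sketch under assumed field names:

```javascript
// A finding that ships no evidence artifacts is downgraded to an open
// question rather than passed through as confirmed. Field names are
// illustrative, not the repo's actual schema.
function normalizeEvidence(finding) {
  if (finding.evidence && finding.evidence.length > 0) {
    return finding; // Observed or Inferred, backed by artifacts
  }
  return { ...finding, state: 'open-question' };
}
```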
### Sub-taxonomy for power preservation
Section 5 distinguishes Compressed (Pass), Delayed (Medium), Hidden (High), and Removed (Critical) — separating "the icon got smaller" from "the feature is unavailable in this mode."
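The ladder reduces to a lookup table. Keys are assumptions; the severity labels come from the rubric:

```javascript
// Section 5's power-preservation ladder: how a feature survived the
// redesign determines severity. Keys here are illustrative.
const POWER_SEVERITY = {
  compressed: 'Pass',     // icon got smaller, capability intact
  delayed:    'Medium',   // extra step before it appears
  hidden:     'High',     // still exists, but users must hunt
  removed:    'Critical', // unavailable in this mode
};
```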
### AI compression risk, calibrated
Section 4 has a severity precondition: Critical only when a source was promised, claimed, fetched, or replaced. Generic unsourced model output is High/Medium provenance ambiguity. Prevents over-firing on every chatbot.
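The precondition can be sketched as a guard clause. The source-state names are assumptions drawn from the sentence above:

```javascript
// Critical is reserved for cases where a source was actually
// promised, claimed, fetched, or replaced. Plain unsourced model
// output caps out at provenance ambiguity.
const SOURCE_IMPLICATED = new Set(['promised', 'claimed', 'fetched', 'replaced']);

function compressionSeverity(sourceState) {
  return SOURCE_IMPLICATED.has(sourceState)
    ? 'Critical'
    : 'High/Medium (provenance ambiguity)';
}
```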
## Current audits
### cognitive-load

```yaml
state: Frozen v0.2 + Dogfooded once
audit_prefix: CL
catches: load displacement
# 4 evidence runs:
#   pt0/                        claude.ai live
#   pt1-github-narrow/          responsive audit
#   pt2-outlook-doc-fallback/   vendor docs + reclassification
#   dogfood-1-research-os/      own tool — clean Warn
```

## Invoke via Claude
> Run cognitive-load audit on https://your.app/dashboard

Produces: `findings.md`, `scorecard.json`, `remediation-priority-list.md`

## Local verify tooling
### Clone + install

```sh
git clone https://github.com/dogfood-lab/interface-audits.git
cd interface-audits
npm install
```

### Run all checks

```sh
npm run verify            # schemas + links + shipcheck
npm run verify:schemas
npm run verify:links
```