# Reduce the Burden. Improve the Experience.
Proof-backed audits for human-facing product surfaces. Generic scanners catch WCAG violations. These audits catch interfaces that pass scanners but still make users hunt — load displacement, hidden complexity, AI trust burden, state-shift failure.
## Invoke

```
Run cognitive-load audit on https://example.com
```
## Outputs

```
audits/cognitive-load/evidence/<run-id>/
├── cognitive-load-findings.md
├── cognitive-load-scorecard.json
└── remediation-priority-list.md
```
## Verify

```sh
npm run verify   # schemas + links + shipcheck
```
## What makes these different

Catches the failures generic scanners miss.
### Names where the load went
Every finding declares one or more values from an 11-item enum (search, memory, trust, source recovery, navigation, configuration, visual decoding, time, recovery/undo, feature loss, verification). No vague "this is hard to use" findings.
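The enum discipline amounts to a membership check. A minimal sketch, assuming hyphenated value names; the canonical strings live in the audit's schema, not here:

```javascript
// Hypothetical load-destination enum. The canonical 11 values are
// defined in shared/schemas/; these strings are illustrative.
const LOAD_TYPES = new Set([
  'search', 'memory', 'trust', 'source-recovery', 'navigation',
  'configuration', 'visual-decoding', 'time', 'recovery-undo',
  'feature-loss', 'verification',
]);

// A finding must name at least one destination, and every one must
// come from the enum. No free-text "this is hard to use" escape hatch.
function validLoadTypes(finding) {
  return Array.isArray(finding.load_types) &&
    finding.load_types.length > 0 &&
    finding.load_types.every((t) => LOAD_TYPES.has(t));
}
```

So `{ load_types: ['search', 'trust'] }` passes, while an empty list or an off-enum value fails validation outright.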
### Schema-validated outputs
Findings and scorecards conform to JSON Schema (shared/schemas/). CI rejects malformed evidence. The audit walks its own talk.
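In CI this is a hard gate: a malformed scorecard fails the build. A minimal stand-in for that gate, checking required keys only; the real pipeline presumably runs a full JSON Schema validator against shared/schemas/, and the field names here are assumptions:

```javascript
// Stand-in for the CI gate: reject a scorecard missing required keys.
// Field names are hypothetical; the real contract is the JSON Schema
// in shared/schemas/.
const REQUIRED = ['audit', 'run_id', 'findings', 'overall_score'];

function checkScorecard(doc) {
  const missing = REQUIRED.filter((k) => !(k in doc));
  return { ok: missing.length === 0, missing };
}
```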
### No evidence, no official audit
Each audit must ship four things: Rubric, Skill, Schema, and at least one Evidence run. Lifecycle states (Draft → Pressure-tested → Frozen → Dogfooded → Revised) are declared in each audit's README.
### Honest evidence states
Findings are marked Observed, Inferred, or Open question. Findings without evidence are downgraded to Open questions, not laundered into confirmed issues. The Inferred fallback is the brake against manufacturing findings.
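The downgrade rule is mechanical. A sketch under assumed field names:

```javascript
// A finding that ships no evidence artifacts is downgraded to an open
// question rather than passed through as confirmed. Field names are
// illustrative, not the repo's actual schema.
function normalizeEvidence(finding) {
  if (finding.evidence && finding.evidence.length > 0) {
    return finding; // Observed or Inferred, backed by artifacts
  }
  return { ...finding, state: 'open-question' };
}
```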
### Sub-taxonomy for power preservation
Section 5 distinguishes Compressed (Pass), Delayed (Medium), Hidden (High), and Removed (Critical) — separating "the icon got smaller" from "the feature is unavailable in this mode."
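The ladder reduces to a lookup table. Keys are assumptions; the severity labels come from the rubric:

```javascript
// Section 5's power-preservation ladder: how a feature survived the
// redesign determines severity. Keys here are illustrative.
const POWER_SEVERITY = {
  compressed: 'Pass',     // icon got smaller, capability intact
  delayed:    'Medium',   // extra step before it appears
  hidden:     'High',     // still exists, but users must hunt
  removed:    'Critical', // unavailable in this mode
};
```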
### AI compression risk, calibrated
Section 4 has a severity precondition: Critical only when a source was promised, claimed, fetched, or replaced. Generic unsourced model output is High/Medium provenance ambiguity. Prevents over-firing on every chatbot.
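The precondition can be sketched as a guard clause. The source-state names are assumptions drawn from the sentence above:

```javascript
// Critical is reserved for cases where a source was actually
// promised, claimed, fetched, or replaced. Plain unsourced model
// output caps out at provenance ambiguity.
const SOURCE_IMPLICATED = new Set(['promised', 'claimed', 'fetched', 'replaced']);

function compressionSeverity(sourceState) {
  return SOURCE_IMPLICATED.has(sourceState)
    ? 'Critical'
    : 'High/Medium (provenance ambiguity)';
}
```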
## Current audits
### cognitive-load

```yaml
state: Frozen v0.2 + Dogfooded once
audit_prefix: CL
catches: load displacement
# 4 evidence runs:
#   pt0/                        claude.ai live
#   pt1-github-narrow/          responsive audit
#   pt2-outlook-doc-fallback/   vendor docs + reclassification
#   dogfood-1-research-os/      own tool — clean Warn
```

## Invoke via Claude
> Run cognitive-load audit on https://your.app/dashboard

Produces: `findings.md`, `scorecard.json`, `remediation-priority-list.md`

## Local verify tooling
### Clone + install

```sh
git clone https://github.com/dogfood-lab/interface-audits.git
cd interface-audits
npm install
```

### Run all checks

```sh
npm run verify            # schemas + links + shipcheck
npm run verify:schemas
npm run verify:links
```