study-swarm
study-swarm is a protocol, not a tool. When you make a substantial design decision with an LLM — a new product layer, an architecture choice, a “should we trust the model here” call — improvising from first principles ships designs that are stale, and citing papers from memory ships designs that rest on sources that don’t exist or don’t say what you think.
study-swarm replaces both with a disciplined loop: dispatch parallel research agents, demand specific cited findings, and gate every citation through an external verifier of a different model family before it informs the design.
It takes its own medicine. The protocol prescribes verifier-protected envelopes for the systems it helps design, so it runs one on itself. No model grades its own homework, including the one running the protocol.
The protocol in five steps
1. Identify 3–5 load-bearing questions where empirical evidence would change the answer.
2. Dispatch one research agent per question, in parallel, accepting cited findings only.
3. Synthesize the findings into a Research grounding section.
4. Verify externally: a different model family, reasoning-stripped, checks every citation.
5. Connect each architectural choice back to a finding by number.
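The five steps can be sketched as a small Python loop. This is an illustrative sketch only, not the protocol's actual implementation: `Finding`, `run_study_swarm`, `connect_choices`, and the agent and verifier callables are all hypothetical names, and halting on any rejected citation is an assumption about how the gate behaves.

```python
from concurrent.futures import ThreadPoolExecutor
from dataclasses import dataclass
from typing import Callable


@dataclass(frozen=True)
class Finding:
    number: int    # stable id that design choices cite in step 5
    question: str
    claim: str
    citation: str


# Hypothetical signatures. An agent answers one question with
# (claim, citation) pairs. A verifier, assumed to be backed by a different
# model family, sees only the claim and the citation.
ResearchAgent = Callable[[str], list[tuple[str, str]]]
Verifier = Callable[[str, str], bool]


def run_study_swarm(questions: list[str], agent: ResearchAgent,
                    verifier: Verifier) -> list[Finding]:
    # Step 1: the caller supplies 3-5 load-bearing questions.
    if not 3 <= len(questions) <= 5:
        raise ValueError("expected 3-5 load-bearing questions")

    # Step 2: one research agent per question, in parallel.
    with ThreadPoolExecutor(max_workers=len(questions)) as pool:
        per_question = list(pool.map(agent, questions))

    # Step 3: synthesize into a numbered list of findings.
    findings: list[Finding] = []
    for question, results in zip(questions, per_question):
        for claim, citation in results:
            findings.append(
                Finding(len(findings) + 1, question, claim, citation))

    # Step 4: external verification. Only claim and citation are passed,
    # never the agent's reasoning. Halting on any failure is an assumption.
    rejected = [f.number for f in findings
                if not verifier(f.claim, f.citation)]
    if rejected:
        raise RuntimeError(f"citations failed verification: {rejected}")
    return findings


def connect_choices(choices: dict[str, list[int]],
                    findings: list[Finding]) -> None:
    # Step 5: every architectural choice must cite a verified finding number.
    known = {f.number for f in findings}
    for choice, cited in choices.items():
        if not cited or not set(cited) <= known:
            raise ValueError(f"choice {choice!r} is not grounded in a finding")
```

Passing the agent and the verifier as separate callables keeps the two roles apart in the sketch, so the component that produces a citation is never the one that approves it.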
Where to go next
- The five steps: the locked execution shape, in detail.
- The verification gate (Step 4): the two-stage check, the halt table, and why a different family.
- Research grounding: the evidence behind the design, and the proof it works.
- Running it: by hand, or with roleos verify-citations.
- Common failure modes: the failures the protocol exists to catch, and the step that catches each.
When to reach for it
Fire study-swarm when any of these hold:
- a decision introduces a new product layer (not a fix or operational tuning);
- the decision is qualitative — “trust the model here?”, “explain or just do?”, “cap options?”, “retry or fall back?”;
- you’re about to recommend a single-axis answer (deterministic-only / LLM-only) where the real answer is multi-axis;
- an adjacent domain (compilers, SRE, databases, mixed-initiative HCI) has likely already solved it.
Skip it for pure fixes, scope extensions of already-grounded work, and operational tuning (“what number,” not “what shape”).
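The trigger and skip conditions above can be written down as a small checklist function. This is a hedged sketch: the `Decision` fields and `should_run_study_swarm` are hypothetical names invented for illustration, and letting a skip condition override a trigger is an assumption the text does not state explicitly.

```python
from dataclasses import dataclass


@dataclass
class Decision:
    # Triggers: any one of these is enough to fire the protocol.
    new_product_layer: bool = False
    qualitative: bool = False
    single_axis_answer: bool = False
    adjacent_domain_prior_art: bool = False
    # Skip conditions.
    pure_fix: bool = False
    extends_grounded_work: bool = False
    operational_tuning: bool = False


def should_run_study_swarm(d: Decision) -> bool:
    # Assumption: a skip condition takes precedence over any trigger.
    if d.pure_fix or d.extends_grounded_work or d.operational_tuning:
        return False
    return any([
        d.new_product_layer,
        d.qualitative,
        d.single_axis_answer,
        d.adjacent_domain_prior_art,
    ])
```

For example, a "retry or fall back?" decision would set `qualitative=True` and fire the protocol, while picking a timeout value would set `operational_tuning=True` and skip it.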