
Common failure modes

study-swarm is defined as much by what it prevents as by what it prescribes. The table below lists the recurring failures a substantial design decision falls into, the symptom that gives each one away, the step that catches it, and the corrective action. Use it as a self-check while a dispatch is in flight.

| Failure mode | Symptom | Caught by | Corrective action |
| --- | --- | --- | --- |
| Fabricated citation | the arXiv id / DOI resolves to nothing | Step 4, retrieval oracle | drop it; there is no real source to correct |
| Misattribution | a real paper, but the wrong author or year | Step 4, retrieval oracle | correct the attribution and re-verify once; a second non-clean verdict drops it |
| Groundedness gap | the link resolves, but the source never makes the claim | Step 4, groundedness lens | rewrite the finding to what the source actually says, or drop it |
| Self-grading | the model that synthesized the design also “verifies” it | Step 4, different-family rule | a verifier of a different model family, reasoning-stripped; never the generator |
| Postdated-paper false-flag | an LLM declares a real 2026 paper “fabricated” because it postdates training | the retrieval-oracle requirement | check existence by retrieval, not recall; an LLM cannot know a paper it never saw |
| Question padding | five “load-bearing” questions, but only two would change a design | Step 1 | run 1–2 agents on the questions that matter; don’t manufacture questions to hit a count |
| Orphan citation | a finding in the grounding section that no Step-5 choice references | Step 5 | connect it to a decision, or cut it; citations without a connection are noise |
| “Studies show…” | a confident claim with no source named | the sourcing standard / study-swarm lint | name the study: author + year + a resolvable arXiv/DOI/URL |
| Verifier unavailable, read as “fine” | the oracle or different-family model is unreachable, so the citation is kept anyway | Step 4 halt table | halt and escalate; an unreachable verifier is a closed gate, never an open one |
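The orphan-citation row lends itself to a mechanical check. The following is a minimal sketch, not part of study-swarm itself: the function name `orphan_findings`, the finding ids, and the shape of the two dictionaries are all assumptions made for illustration. It lists every finding in the grounding section that no Step-5 choice references.

```python
def orphan_findings(findings: dict[str, str],
                    decisions: dict[str, list[str]]) -> list[str]:
    """Return ids of findings that no decision references.

    findings:  hypothetical mapping of finding id -> finding text
    decisions: hypothetical mapping of decision name -> finding ids it cites
    """
    # Collect every finding id that at least one decision points at.
    referenced = {fid for refs in decisions.values() for fid in refs}
    # Anything left over is an orphan: connect it to a decision, or cut it.
    return sorted(fid for fid in findings if fid not in referenced)


# Illustrative data only; ids and texts are made up for the example.
example_findings = {
    "F1": "finding cited by the storage decision",
    "F2": "finding nobody uses",
    "F3": "finding cited by the retry decision",
}
example_decisions = {
    "storage": ["F1"],
    "retry policy": ["F3", "F1"],
}

orphans = orphan_findings(example_findings, example_decisions)
print(orphans)  # ['F2']
```

An empty result means every grounded finding is connected to at least one decision; a non-empty one is the list to connect or cut.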

The throughline: an unverified citation never reaches the design. Every row above is a way that rule gets quietly broken, and the step that stops it. When in doubt, the protocol’s bias is to drop or escalate, never to proceed on hope.
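The fail-closed bias can be sketched as a small decision function. This is an illustration under stated assumptions, not the protocol's actual implementation: the `Verdict` and `Action` enums, the `gate` function, and the `already_corrected` flag are hypothetical names, and treating any non-clean verdict after one correction as a drop is one reading of the misattribution row.

```python
from enum import Enum


class Verdict(Enum):
    # Hypothetical verdict labels, mirroring the symptoms in the table.
    CLEAN = "clean"
    FABRICATED = "fabricated"        # arXiv id / DOI resolves to nothing
    MISATTRIBUTED = "misattributed"  # real paper, wrong author or year
    UNGROUNDED = "ungrounded"        # source never makes the claim
    UNAVAILABLE = "unavailable"      # oracle or verifier unreachable


class Action(Enum):
    KEEP = "keep"
    DROP = "drop"
    CORRECT_AND_REVERIFY = "correct and re-verify"
    REWRITE_OR_DROP = "rewrite or drop"
    HALT_AND_ESCALATE = "halt and escalate"


def gate(verdict: Verdict, already_corrected: bool = False) -> Action:
    """Map a verification verdict to an action, failing closed."""
    if verdict is Verdict.CLEAN:
        return Action.KEEP
    if verdict is Verdict.UNAVAILABLE:
        # An unreachable verifier is a closed gate, never an open one.
        return Action.HALT_AND_ESCALATE
    if already_corrected:
        # Assumption: a second non-clean verdict of any kind drops it.
        return Action.DROP
    if verdict is Verdict.MISATTRIBUTED:
        return Action.CORRECT_AND_REVERIFY
    if verdict is Verdict.UNGROUNDED:
        return Action.REWRITE_OR_DROP
    # Fabricated: there is no real source to correct.
    return Action.DROP
```

Only a clean verdict returns `Action.KEEP`; every other path drops, reworks, or escalates, so no branch lets an unverified citation through by default.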