
Routing map
Do not send every question down the same lane.
Multi-LLM routing is the practice of matching a task to the model behavior, tool access, context budget, and review lane that fit the work. It turns model choice from preference into a documented operating decision.
Low risk
Transform
Use fast lanes for formatting, tone variants, extraction, and drafts, where a human review or a check against the source is an obvious next step.
Medium risk
Compare
Use two-lane review when the work needs judgment, disagreement discovery, or alternatives before a route is selected.
High risk
Escalate
Move to source-first research, domain review, or manual verification when a confidently wrong answer would carry real cost.
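One way to make the map a documented decision rather than a habit is to write it down as a lookup. The sketch below is only an illustration: the names `Risk`, `LANES`, and `route` are hypothetical, and the lane descriptions are paraphrased from the three tiers above.

```python
from enum import Enum


class Risk(Enum):
    LOW = "low"
    MEDIUM = "medium"
    HIGH = "high"


# Hypothetical lookup: each risk tier maps to (lane name, what the lane means).
# The wording is paraphrased from the routing map; adjust it to your own process.
LANES = {
    Risk.LOW: ("transform", "fast lane; human or source check follows"),
    Risk.MEDIUM: ("compare", "two-lane review before a route is selected"),
    Risk.HIGH: ("escalate", "source-first research, domain review, or manual verification"),
}


def route(risk: Risk) -> str:
    """Return the lane name for a given risk tier."""
    lane, _description = LANES[risk]
    return lane
```

Calling `route(Risk.MEDIUM)` returns `"compare"`. Keeping the table in one place means a change to the routing policy is a visible edit, not a shift in individual preference.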
Routing questions worth asking before the prompt
Does the answer require current facts, or can it rely on stable knowledge?
Will the reader need citations, provenance, or a reproducible trail?
Is the task mainly transformation, reasoning, retrieval, coding, synthesis, or critique?
What is the cost of a polished but unsupported answer?
Should the model produce a final answer or a checklist for a human reviewer?
Where will this output live after the session ends?
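The questions above can be captured as a small record filled in before the prompt is written. The sketch below is a minimal illustration, not a prescribed scoring method: the field names, the `JUDGMENT_TASKS` set, and the rule inside `suggest_lane` are all assumptions chosen to line up with the Transform, Compare, and Escalate lanes.

```python
from dataclasses import dataclass

# Assumption: these task types usually benefit from a second opinion.
JUDGMENT_TASKS = {"reasoning", "synthesis", "critique"}


@dataclass
class RoutingAnswers:
    needs_current_facts: bool   # current facts vs. stable knowledge
    needs_provenance: bool      # citations or a reproducible trail
    task_type: str              # "transformation", "reasoning", "retrieval", ...
    unsupported_cost: str       # "low", "medium", or "high"
    wants_final_answer: bool    # final answer vs. checklist for a reviewer
    destination: str            # where the output lives after the session


def suggest_lane(a: RoutingAnswers) -> str:
    """Suggest a lane from the answers. The thresholds are illustrative only."""
    # Assumed rule: high cost or a need for sources goes to escalation.
    if a.unsupported_cost == "high" or a.needs_current_facts or a.needs_provenance:
        return "escalate"
    # Assumed rule: medium cost or judgment-heavy work goes to two-lane review.
    if a.unsupported_cost == "medium" or a.task_type in JUDGMENT_TASKS:
        return "compare"
    return "transform"
```

For example, `suggest_lane(RoutingAnswers(False, False, "transformation", "low", True, "chat only"))` returns `"transform"`. The last two fields do not affect the suggestion in this sketch; they are kept so the record documents what was asked for and where the result will end up.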