Chapter 7: Choosing the Right Model: Capability Tiers, Not Hype

Series: LLM Development Guide
Chapter 7 of 15
Previous: Chapter 6: Scaling the Workflow: Phases, Parallelism, Hygiene
Next: Chapter 8: Security & Sensitive Data: Sanitize, Don’t Paste Secrets
What you’ll be able to do
You’ll be able to pick a model and interface deliberately:
- Use capability tiers instead of memorizing brand names.
- Upgrade quickly when quality is the bottleneck.
- Avoid wasting flagship models on structured boilerplate.
TL;DR
- Treat model choice as a cost-of-mistakes problem.
- Use flagship models for planning, debugging, and high-stakes decisions.
- Use mid-tier models for implementation with strong references.
- Use fast/cheap models for boilerplate and simple transformations.
- If you’ve spent ~10 minutes fighting output quality, upgrade or shrink scope.
As-of note
This chapter reflects the state of things as of 2026-02-14. Model names, pricing, and product policies change frequently, so prefer tier-based guidance and verify vendor policies directly before using tools with sensitive data.
The capability tiers
Think in tiers:
- Flagship: best reasoning and instruction-following for novel work.
- Mid-tier: strong general performance for structured work with references.
- Fast/cheap: good for simple tasks, but with a higher error rate on complex reasoning.
This framing stays useful even when names change.
Task-to-tier mapping
Use flagship for:
- Planning and architecture.
- Debugging complex failures.
- Security-sensitive review.
- Anything where mistakes are expensive.
Use mid-tier for:
- Implementation that follows existing patterns.
- Refactors with clear examples.
- Writing tests when the behavior is already defined.
Use fast/cheap for:
- Syntax lookups.
- Boilerplate you will review.
- Mechanical transformations.
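The mapping above can be sketched as a small lookup. This is a minimal illustration, not a standard tool: the function name `tier_for_task` and the task labels are hypothetical shorthand for the bullets above, and the choice to fall back to flagship for unrecognized work is an assumption based on the "mistakes are expensive" rule.

```shell
#!/usr/bin/env bash
# Hypothetical helper: map a task label to a capability tier.
# The labels are illustrative shorthand for the lists above,
# not an established vocabulary.
tier_for_task() {
  case "$1" in
    planning|architecture|debugging|security-review)
      echo "flagship" ;;
    implementation|refactor|tests)
      echo "mid-tier" ;;
    syntax-lookup|boilerplate|transformation)
      echo "fast" ;;
    *)
      # Assumption: unknown work is treated as potentially expensive.
      echo "flagship" ;;
  esac
}

tier_for_task refactor
```

Keeping the mapping in one place makes it easy to adjust as your own experience with each tier accumulates.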
Red flags: upgrade now
Upgrade when you see:
- The model repeats the same misunderstanding.
- Output ignores constraints.
- “Looks right” code fails in tests.
- You are on the third prompt iteration for the same unit.
The cheapest model is the one that gets you to a correct, verified change in the least total time.
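One way to make these red flags harder to ignore is to write them down as an explicit check. The sketch below is an assumption-laden illustration: the function name `should_upgrade` and its four arguments are hypothetical, the iteration threshold of 3 comes from the list above, and the 10-minute threshold comes from the TL;DR.

```shell
#!/usr/bin/env bash
# Hypothetical helper: decide whether to stop iterating and upgrade.
# Arguments (all assumptions for illustration):
#   $1 = prompt iterations spent on the current unit
#   $2 = minutes spent fighting output quality
#   $3 = "yes" if the same misunderstanding or ignored constraint recurred
#   $4 = "yes" if plausible-looking code failed its tests
should_upgrade() {
  local iterations="$1" minutes="$2" repeated="$3" tests_failed="$4"
  # Effort-based triggers: third iteration, or roughly ten minutes lost.
  if [ "$iterations" -ge 3 ] || [ "$minutes" -ge 10 ]; then
    return 0
  fi
  # Quality-based triggers: any single red flag is enough.
  if [ "$repeated" = "yes" ] || [ "$tests_failed" = "yes" ]; then
    return 0
  fi
  return 1
}

if should_upgrade 3 4 no no; then
  echo "upgrade or shrink scope"
else
  echo "keep going"
fi
```

The exact thresholds matter less than having them fixed before you start, so the decision is not renegotiated mid-task.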
A selection checklist
Before you start, answer:
- Is this novel or pattern-following?
- Do I have reference implementations?
- What is the cost of mistakes?
- Is this structured or ambiguous?
- Am I debugging or implementing?
If uncertain:
- Start with flagship for planning.
- Drop to mid-tier once you have a stable pattern and good references.
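The checklist can also be read as a rough decision procedure. The sketch below is one possible interpretation, not a rule from this guide: the function name `starting_tier`, the argument names, and their allowed values are all hypothetical, and it simplifies "debugging" to always mean flagship. It only chooses between flagship and mid-tier, mirroring the fallback advice above; fast/cheap work is assumed to be identified separately.

```shell
#!/usr/bin/env bash
# Hypothetical helper: turn checklist answers into a starting tier.
# Arguments (names and values are assumptions for illustration):
#   $1 = novelty:   novel | pattern
#   $2 = has_refs:  yes | no
#   $3 = cost:      high | low    (cost of mistakes)
#   $4 = shape:     structured | ambiguous
#   $5 = mode:      debugging | implementing
starting_tier() {
  local novelty="$1" has_refs="$2" cost="$3" shape="$4" mode="$5"
  # Any signal of novelty, risk, ambiguity, or debugging points to flagship.
  if [ "$novelty" = "novel" ] || [ "$cost" = "high" ] ||
     [ "$shape" = "ambiguous" ] || [ "$mode" = "debugging" ]; then
    echo "flagship"
  elif [ "$has_refs" = "yes" ]; then
    # Stable pattern plus good references: mid-tier is enough.
    echo "mid-tier"
  else
    # Pattern-following but no references yet: treat as uncertain.
    echo "flagship"
  fi
}

starting_tier pattern yes low structured implementing
```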
Verification
A practical way to keep this from being hand-wavy is to force a written decision for each task.
Create a small note file per task:
```shell
mkdir -p work-notes
cat > work-notes/model-selection.md <<'MD'
# Model Selection (Per Task)
## Task
<What are we doing?>
## Risk
- Cost of mistakes:
- Can I review the output competently?
## References
- <Paths to reference implementations>
## Model decision
- Tier: <flagship|mid-tier|fast>
- Why:
- When to upgrade:
## Outcome
- Did we upgrade?
- What broke / what worked:
MD
```
Expected result:
- You can justify the model choice in one minute.
- You have a trigger for upgrading when output quality is the bottleneck.