Chapter 9: Stop Rules + Pitfalls: When to Upgrade, Bail, or Go Manual

Series: LLM Development Guide
Chapter 9 of 15
Previous: Chapter 8: Security & Sensitive Data: Sanitize, Don’t Paste Secrets
Next: Chapter 10: Measuring Success: Solo + Team Metrics Without Fake Precision
What you’ll be able to do
You’ll be able to avoid two common failure modes:
- Spending hours fighting the model.
- Shipping output you can’t review.
You’ll do it with explicit stop rules, upgrade triggers, and a short recovery checklist.
TL;DR
- If the change is under a minute manually, do it manually.
- If you can’t review the output competently, don’t ship it.
- If you’re on your third attempt for the same logical unit, upgrade or re-scope.
- Add verification steps to plans and prompts so “done” is testable.
Table of contents
- Stop rules
- Top pitfalls
- Recovery checklist
- Verification
Stop rules
These are pragmatic defaults. Tune them to your environment.
Stop rule 1: tiny changes
If it is a tiny change (one line, one rename, one version bump), do it manually.
LLM overhead is real:
- You still have to explain the task.
- You still have to review the output.
- You still have to verify the result.
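For scale, here is the kind of change this rule covers; a one-line version bump is faster by hand than by prompt (hypothetical file and version numbers):
# one-line version bump: quicker to do than to describe
sed -i 's/"version": "1.4.2"/"version": "1.4.3"/' package.json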
Stop rule 2: you can’t review it
Never commit code you could not explain in a review.
If you don’t understand the domain:
- break the work into smaller pieces you can understand, or
- involve a reviewer who does.
Stop rule 3: you’re fighting output quality
The 10-minute rule:
- If you’ve spent about 10 minutes fighting the output, stop.
- Upgrade the model tier, or shrink the scope to a smaller logical unit.
Stop rule 4: high-risk code needs extra caution
Be cautious with:
- Authentication and authorization.
- Cryptography.
- Payment flows.
- Input validation.
You can still use LLMs, but the bar for review and verification is higher.
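One way to raise the bar is to make the extra verification mechanical. A sketch, assuming a Python project with a dedicated test suite for the sensitive area (hypothetical path, adjust to your repo):
pytest tests/auth/ -v   # focused suite for the touched area
pytest                  # full suite before merging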
Top pitfalls
These show up repeatedly.
- Trusting output without review.
- Skipping planning.
- Not providing reference implementations.
- Letting sessions run too long.
- Scope creep mid-session.
- Vague prompts.
- Not capturing decisions.
- No verification step (see the sketch below).
A simple rule:
- If you wouldn’t merge a junior developer’s PR without review, don’t merge LLM output without review.
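The last pitfall, no verification step, is the cheapest to fix: make “done” a command rather than a feeling. A minimal sketch, assuming a Node project with test and lint scripts (substitute your own commands):
npm test          # all tests pass
npm run lint      # no new lint errors
git diff --stat   # only the intended files changed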
Recovery checklist
When things go wrong:
- Stop iterating on bad output.
- Decide what kind of problem it is:
  - a prompt problem,
  - a model capability problem,
  - a task that is a poor fit.
- Simplify:
  - a smaller logical unit,
  - more references,
  - clearer constraints.
- Fresh session if context has drifted.
- Manual fallback is a valid outcome.
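If you want this checklist on disk next to your stop rules, a minimal sketch (it reuses the work-notes directory created in the Verification section below):
mkdir -p work-notes
cat > work-notes/recovery-checklist.md <<'MD'
# Recovery Checklist
1. Stop iterating on bad output.
2. Classify: prompt problem, capability problem, or poor-fit task.
3. Simplify: smaller unit, more references, clearer constraints.
4. Fresh session if context has drifted.
5. Manual fallback is a valid outcome.
MD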
Verification
Create a one-page stop-rules file so you can apply this consistently across tasks:
mkdir -p work-notes
cat > work-notes/stop-rules.md <<'MD'
# Stop Rules (Personal Defaults)
## Manual first
- If change is <= 1 minute manually, do it manually.
## Upgrade triggers
- Third attempt on same logical unit.
- Repeated misunderstandings.
- Output ignores constraints.
## Bail triggers
- I cannot review this competently.
- Task requires live debugging with runtime state.
- Sensitive data would be required to reproduce.
## Required gates
- Verification commands exist in plan.
- Verification commands exist in prompt.
- Work notes updated before continuing.
MD
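To make the policy routine, you can print it at the start of each session (an optional habit, not required by the gates above):
cat work-notes/stop-rules.md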
Expected result:
- You have a written policy you can apply without debating every time.
Continue -> Chapter 10: Measuring Success: Solo + Team Metrics Without Fake Precision