When to use ralph loops¶
Ralph loops are powerful but they're not the right tool for everything. This page helps you decide whether a loop fits your task before you invest time setting one up.
The sweet spot¶
Ralph loops work best when a task has these properties:
Decomposable into small, independent steps. The loop does one thing per iteration. Tasks that naturally break into "do this, then this, then this" are ideal — implementing features from a TODO list, writing tests module by module, fixing lint errors one at a time.
Has a clear definition of "done" for each step. Commands need something to validate. If you can express "this iteration succeeded" as a command that exits 0 or non-zero (tests pass, build succeeds, lint is clean), the self-healing loop works. If success requires human judgment ("does this look good?"), a loop can't self-correct.
Benefits from fresh context. Long conversations degrade — the agent loses track of earlier instructions, fills up the context window, and starts making mistakes. If your task will take more than 15-20 minutes of agent work, a loop outperforms a single conversation because each iteration starts clean with the current state of the codebase.
Progress is visible in the codebase. The agent's work must be observable on disk — files changed, tests added, docs written, commits made. The next iteration reads the codebase to understand what's been done. Tasks that produce output elsewhere (a Slack message, a deployment, an email) need a wrapper that records progress locally.
What works well¶
| Task | Why it fits |
|---|---|
| Implementing features from a spec | Each feature is one iteration; test commands validate correctness |
| Writing tests | Each module is one iteration; coverage commands guide prioritization |
| Fixing lint / type errors | Each fix is small and independently verifiable |
| Documentation improvements | Each page is one iteration; build commands validate |
| Codebase migrations (JS to TS, Python 2 to 3) | Each file is one iteration; the compiler validates |
| Bug triage | Each bug is one iteration; regression tests verify the fix |
| Refactoring | Each extraction/rename is one iteration; tests catch regressions |
What doesn't work well¶
| Task | Why it doesn't fit |
|---|---|
| Design decisions | Requires human judgment about trade-offs — no command can validate "is this the right architecture?" |
| Tasks requiring multi-step reasoning across iterations | Each iteration starts fresh — the agent can't "continue where it left off" from memory, only from what's on disk |
| One-shot tasks | If the task takes 5 minutes and you won't repeat it, just chat with the agent — the loop setup overhead isn't worth it |
| Tasks with no automated validation | Without command feedback, there's no self-healing — the agent may compound errors across iterations |
| Creative writing | Prose quality is subjective; no command can validate "is this well-written?" |
| Interacting with external services | API calls, deployments, and messages are hard to undo if the agent makes a mistake |
Loop vs. single conversation¶
Use a single conversation when:
- The task will take less than 10-15 minutes
- You want to iterate interactively with the agent
- The task requires back-and-forth discussion
- You need to make subjective decisions along the way
Use a ralph loop when:
- The task involves many similar, independent steps
- You want the agent to work autonomously without your attention
- You have commands that can validate correctness (tests, linters, builds)
- The task would fill up a conversation's context window
- You want to walk away and come back to completed work
Making a borderline task work¶
Some tasks seem like they don't fit but can be adapted:
"There's no automated validation for this." Write a command for it. Even a simple script that checks for obvious problems (file exists, no syntax errors, word count above threshold) catches the worst failures. Add it to your commands list and reference it in the prompt.
"The task requires multi-step reasoning." Use a PLAN.md or TODO.md file as the coordination mechanism. The agent reads the plan each iteration, marks steps done, and the next iteration continues from there. The plan file IS the agent's memory.
"Each iteration depends on the previous one." That's fine — the agent reads the codebase, which includes all previous iterations' commits. As long as progress is visible on disk, the fresh context model works. The agent doesn't need conversation memory when the code tells the story.
"I need to review the agent's work before it continues." Use -n 1 to run single iterations, review, then run again. Or use --stop-on-error with a command that checks for your sign-off.
How many iterations?¶
- Start with
-n 3to verify your setup works and the agent produces useful output - Use
-n 10-20for bounded tasks (a TODO list with known items) - Run unlimited (
ralph run my-ralphwithout-n) for open-ended improvement tasks with good command feedback — the commands prevent the agent from going off the rails - Use
--stop-on-errorwhen each iteration must succeed before the next one makes sense
Next steps¶
- Getting Started — set up your first loop
- Writing Prompts — patterns for effective autonomous loop prompts
- Cookbook — copy-pasteable setups for common workflows