
First Impressions
Who would have thought Ralph Wiggum would become so popular! I recently started using Ralph Wiggum mode and can definitely see why—it’s incredibly powerful. I often liken different modes of working with AI to power tools, and this mode feels a bit like the first 3D printer: highly capable of building complex structures, able to run for extended periods, works well with spec-driven development, and is impressively autonomous. Some of my “prints” have failed or turned into spaghetti, but the fascinating thing about this approach is that it’s designed to learn. Retries are so inexpensive that you can simply try again.
If you haven’t read about Ralph Wiggum mode, I suggest checking out these two posts:
- The original introduction to the technique
- How Ralph built a programming language from scratch in 3 months
Why This is Exciting
That’s a lot of hype for one paragraph, so you might reasonably wonder whether to trust any of it. I won’t claim it’s a fully proven method, but after testing it myself, Ralph mode feels light years ahead of anything I’ve tried before. In two days, I built a filtered table view with six combinable states and two tabs over complex data representing AI model deployments. At the same time, I built a React meditation tracking page that connects to a Muse EEG device and streams the EEG data to a real-time Postgres database. Each of these projects would typically have taken me weeks, possibly months, on its own, and yet I finished both in two days.
So, how did they turn out? Honestly, both came out well. The code wasn’t a horrid mess (though the pull requests were massive). My plans covered several failure cases, and the implementation handled them better, by far, than any junior developer I’ve worked with.
Just Try It
Honestly, just give it a shot, and try something genuinely complex. The key is to write a rock-solid spec and then watch closely what the agent is doing. It burns through tokens quickly, so you want to stop it early if your spec is wrong.
How to Write a Spec File
You want to explain to the agent, in high-level terms, what you want to build. For example:
I want to add an EEG device connection workflow to my meditation page. We should ensure the headband is properly configured before allowing meditation to start. The page should show the connection state during the session and print sensor data to the page.
Study the codebase (Ultra Think), then help me write a series of spec files for this connection flow. Ask questions to clarify implementation details as needed.
This creates a series of spec files. Then run the specs through a few different “thinking” models, asking each to create its own spec files, and have one model consolidate everything into the best possible version, making sure there’s no code in the spec.
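If you want to script that consolidation pass, here’s a rough sketch. Everything in it is an assumption: the `agent` CLI, its `--model` flag, and the file names are stand-ins for whatever your models actually expose.

```python
import subprocess
from pathlib import Path

def run_model(model: str, prompt: str) -> str:
    # Hypothetical CLI call; swap in whatever your models actually expose.
    result = subprocess.run(
        ["agent", "--model", model, "-p", prompt],
        capture_output=True, text=True, check=True,
    )
    return result.stdout

spec_prompt = Path("spec-request.md").read_text()
models = ["model-a", "model-b", "model-c"]   # placeholder model names

# Ask each "thinking model" for its own draft of the spec.
drafts = [run_model(m, spec_prompt) for m in models]

# Have one model consolidate the drafts, with an explicit no-code rule.
consolidation = (
    "Consolidate these draft specs into the best possible version. "
    "Describe behavior only; the spec must contain no code.\n\n"
    + "\n\n---\n\n".join(drafts)
)
Path("spec.md").write_text(run_model(models[0], consolidation))
```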
Review your spec carefully to ensure it makes sense and correct anything that’s off.
Plan
Next, run the Ralph loop in plan mode. This will look at your application and specs, then create a phased plan for implementing the feature. I’ll usually run the planner multiple times over the spec and code, and use different models to get the best result.
Review the plan file: the agent will often leave notes in it that you need to address, and you may catch it planning work you don’t want to do yet.
Keep It Small
Ralph is extremely powerful—the method’s creator ran it for three months and built an entire programming language. However, you likely don’t want to review a whole programming language in a single PR (and Ralph would happily make that PR). The tighter your spec, the easier quality assurance will be later.
Build
Now you just need to let the Ralph loop run. It will operate until it considers itself done (if you instruct it to), or until it hits the loop count you set. My Copilot version does this by default.
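For reference, the loop itself is simple. This is a minimal sketch rather than my actual harness: the `agent` CLI, the PROMPT.md file name, and the DONE sentinel are all assumptions you’d swap for your own setup.

```python
import subprocess
from pathlib import Path

PROMPT = Path("PROMPT.md")   # the agent file: spec pointers, rules, conventions
MAX_LOOPS = 20               # hard stop so a bad spec can't burn tokens forever

for i in range(MAX_LOOPS):
    # Fresh invocation each loop: the agent re-reads the prompt, plan, and
    # specs from disk, so its state lives in files rather than in context.
    result = subprocess.run(
        ["agent", "-p", PROMPT.read_text()],
        capture_output=True,
        text=True,
    )
    print(f"--- loop {i + 1} ---\n{result.stdout}")
    # Assumed convention: the agent prints DONE once the plan has no
    # unchecked items left.
    if "DONE" in result.stdout:
        break
```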
Monitor
I keep the agent open so I can watch what it’s doing. It quickly becomes clear if it’s going in the wrong direction, allowing you to save your tokens. I also leave a browser window open to watch any UI work it might be doing. Watch the implementation plan file as well—it will be updated by the agent as it works.
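A tiny polling script can double as a plan-file monitor; the IMPLEMENTATION_PLAN.md name is an assumption about where your plan lives.

```python
import time
from pathlib import Path

plan = Path("IMPLEMENTATION_PLAN.md")   # assumed plan file name
last_mtime = 0.0

while True:
    if plan.exists():
        mtime = plan.stat().st_mtime
        if mtime != last_mtime:          # the agent rewrote the plan
            last_mtime = mtime
            print(f"plan updated at {time.ctime(mtime)}")
    time.sleep(2)                        # cheap polling beats a dependency
```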
Malloc the Array
According to its creator, much of the method’s effectiveness comes down to the size of the prompt context window. He loves the analogy of “mallocing the array.” If that makes little sense to you, think of it this way: you need to keep prompts small so the agent has room to work. The interesting thing about this approach is that although the plan and the spec are large, they live in files, so the agent only needs to load the relevant part of a file to do its work. The agent’s state lives in the file, which matters because it has no memory otherwise.
The way I make sense of this is to picture recursion over a linked list. The agent treats each step of each phase as a node: it traverses the list of todos and grabs the next one, as prioritized in the planning phase. Since it only ever works on one small part of the larger project, it can keep the prompt small when loading codebase context and spec information. This also makes search more efficient, since the agent can focus on a single small task.
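To make that traversal concrete, here’s a sketch of picking the next todo from the plan file. The markdown checkbox format is an assumption about how the plan is kept.

```python
from pathlib import Path

def next_todo(plan_path: str = "IMPLEMENTATION_PLAN.md") -> str | None:
    # Walk the plan top to bottom and return the first unchecked item,
    # i.e. the "next node" in the list. Everything else stays on disk.
    for line in Path(plan_path).read_text().splitlines():
        stripped = line.strip()
        if stripped.startswith("- [ ]"):
            return stripped.removeprefix("- [ ]").strip()
    return None   # every item is checked off

# Each loop's prompt then only needs this single task plus pointers to the
# relevant spec section, which keeps the context window small.
print(next_todo() or "DONE")
```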
One problem I struggled with before this approach was the agent’s prompt getting overloaded over a session, eventually dropping important instructions (like how to run tests). I’ve found this doesn’t happen nearly as often with Ralph mode, and when it does, it’s less of a concern, since the harness will re-prompt with the entire agent file in a future loop and things will eventually run as expected.
Eventual consistency is a big idea in Ralph. Since the loop is iterative, you’ll see it checking its own work repeatedly. It can seem a bit “anxious,” but it does catch things it missed—and then fixes them or updates the plan.
Self-Improvement and Self-Updating Agents
A standout feature of Ralph mode is its ability to improve itself. When Ralph encounters incomplete instructions or uncertainty, it updates its own memory (the “agents” file) and plan, recording new notes or clarifying questions for future loops. This isn’t a static process: the agent is actively building a project-specific knowledge base as it works.
The result is a feedback loop: Ralph learns from gaps or mistakes and captures new information as it goes, making it less likely to repeat errors. Over time, these self-updating agents make the process more adaptive and reliable, getting smarter and more effective the longer Ralph runs.
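Here’s a sketch of what that capture step could look like, assuming a hypothetical AGENT_NOTES.md memory file and entry format:

```python
from datetime import datetime, timezone
from pathlib import Path

def record_learning(note: str, notes_path: str = "AGENT_NOTES.md") -> None:
    # Append-only memory: future loops re-read this file, so a lesson only
    # has to be learned once.
    stamp = datetime.now(timezone.utc).isoformat(timespec="seconds")
    with Path(notes_path).open("a") as f:
        f.write(f"- [{stamp}] {note}\n")

record_learning("Tests must be run via `make test`, not pytest directly.")
```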
My Improvements
One feature I really wanted was the ability to see why an agent did something, and for the agent to see its own reasoning as well. I updated my version of Ralph to use Git Notes. Notes are like commit messages attached to commit hashes, but designed to hold much longer text. I dump verbose agent logs into notes so I can see an agent’s thinking later. Agents can also see what agents on other branches are doing, much like shared session storage. And since notes are stored as files, agents can search through them instead of pulling an entire note into the prompt.
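Here’s a sketch of the mechanics using git’s real `notes` subcommands; the `ralph` notes ref is my naming, not a git default.

```python
import subprocess

def add_agent_log(commit: str, log_text: str) -> None:
    # Attach a verbose agent log to a commit under refs/notes/ralph.
    # -f overwrites any existing note on that commit.
    subprocess.run(
        ["git", "notes", "--ref=ralph", "add", "-f", "-m", log_text, commit],
        check=True,
    )

def show_agent_log(commit: str) -> str:
    # Read the log back later (or from another branch's agent).
    result = subprocess.run(
        ["git", "notes", "--ref=ralph", "show", commit],
        capture_output=True, text=True, check=True,
    )
    return result.stdout

add_agent_log("HEAD", "Chose polling over websockets; see spec section 3.")
print(show_agent_log("HEAD"))
```

One caveat: refs under refs/notes/ aren’t pushed or fetched by default, so sharing logs across machines takes an explicit push or fetch of that ref.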
Ralph Mode with GitHub Copilot
I’ve used Ralph mode with Claude and wanted to build a Copilot-compatible version, as my work account gives me access to more tokens (and Ralph does burn a lot!). You can try it with:
- Copilot: My fork of the Ralph loop
- Claude: Original version