Laconian Engineering · homelab · systems · signal
2026-04-13 · 6 min read

The Oracle Protocol

I asked Claude to review its own code. It said the code was fine. The code was not fine.

ai · workflow · claude-code · codex

I asked Claude to review its own code.

It said the code was fine. It wasn't. I found the bug three days later, when I actually ran the thing and watched it fall over. The review had been cheerful. Detailed. Wrong.

That's when I stopped pretending self-review was review.

It wasn't one incident, either. It had been happening for weeks and I'd been ignoring it — I'd ask Claude to check its own work, I'd get back a review that sounded thoughtful, I'd ship, and then I'd find the problem myself sometime later. The reviews weren't catching anything because the reviews weren't independent of the work.


The Writer Is Not the Reader

The problem is obvious once you say it out loud. If you ask a model to check its own output, the model that just produced the output is the same model doing the checking. Whatever assumptions shaped the writing are still in there when it reads. It already thinks the code is correct — it just wrote it.

Calling that review is generous. It's closer to the model nodding at itself.

I needed a different reader. Not a more careful pass from the same model — that was just the same conversation with a fresh prompt. I needed another model entirely, trained on different stuff, with different habits about what it flags and what it lets slide.


What I Actually Run

Two agents on my dev VM. Claude Code writes. Codex CLI reviews. They don't swap, which I'll come back to.

The workflow I settled on goes like this:

  1. Plan. Claude drafts an approach before it writes anything. What files it'll touch, what the risks are, what's in scope.
  2. Critique. Claude runs some internal critique agents over the plan — VP of QC, Lead Engineer, whatever role makes sense. This is still self-review and I know it, but critiquing a plan is cheap and it catches the stupid stuff before any code exists.
  3. Read before write. Claude reads every file it's about to touch. Not skim. Read. This rule alone fixed half the problems I was having.
  4. Write. Claude produces the change.
  5. Adversarial review. The diff goes to Codex with instructions to attack it — correctness, security, performance, clarity, conventions. If it's clean Codex replies REVIEW_CLEAN. If it isn't, every issue comes back as - ISSUE: <description>.
  6. I approve. A human in the loop, always.
  7. Ship. Commit, push, rebuild.

Nothing hits GitHub until Codex returns clean and I've signed off.
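The gate in steps 5–7 is simple enough to sketch. Only the output contract is from the post — the literal token REVIEW_CLEAN, or one "- ISSUE: <description>" line per finding. Everything else here (the function names, the idea of passing the reviewer in as a callable) is my stand-in for whatever actually invokes Codex:

```python
def parse_review(output: str) -> list[str]:
    """Parse reviewer output into a list of issues. Empty list means clean."""
    if output.strip() == "REVIEW_CLEAN":
        return []
    issues = []
    for line in output.splitlines():
        line = line.strip()
        # The reviewer reports each finding as "- ISSUE: <description>".
        if line.startswith("- ISSUE:"):
            issues.append(line[len("- ISSUE:"):].strip())
    return issues

def gate(diff: str, run_reviewer) -> bool:
    """Return True only when the reviewer comes back clean.

    `run_reviewer` is a hypothetical callable that sends the diff to the
    reviewing model and returns its raw text output.
    """
    issues = parse_review(run_reviewer(diff))
    for issue in issues:
        print(f"blocked: {issue}")
    return not issues
```

The point of the strict format is that the gate needs no judgment: a clean token passes, anything else blocks, and the human approval step happens only after the gate says yes.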


The Role Discipline

Writer and reviewer don't swap. That's the part I had to learn the hard way.

It's tempting, when Codex is busy or its sandbox is broken or I'm in a hurry, to just ask Claude to review its own work as a one-off. I've done it. It always sounds reasonable in the moment and it never actually works — Claude produces a review that agrees with what Claude already did, and I'm back where I started.

Same thing going the other direction. If I let Codex write production code once, now there's a precedent, and the next time I'm annoyed with Claude it's easier to hand the work across, and pretty soon the whole protocol is sitting on the shelf because every individual exception made sense at the time.

Writer writes. Reviewer reviews. When I have to break the rule I say so out loud, to myself, and I treat the output accordingly.


Why Two Different Models

The obvious question is why not two Claudes, or two Codexes. The answer I landed on after trying both: same model, same instincts. When I ran two Claude instances against each other they'd agree more than I wanted them to — they'd been shaped by the same training, they had the same taste, they liked the same abstractions and had the same suspicions. The second Claude tended to sound like the first Claude nodding.

Claude Opus 4.6 and Codex (running GPT-5.4 at the moment) don't nod at each other. They disagree about abstractions, about how much validation is too much, about whether a name is ugly enough to push back on. Most of those disagreements aren't the model being clever — they're just two different defaults colliding, and the collision is where I actually get a second opinion.

If they always agreed I'd only need one of them.


The Bubblewrap War (Briefly)

The protocol nearly died in its first week because of a Linux primitive I'd never thought about in my life.

Codex runs its reviews inside a read-only sandbox. The sandbox is built on bubblewrap, which on startup tries to create a loopback network interface. On my dev VM it couldn't — the VM is itself a container, and it doesn't have the kernel capabilities to let nested containers configure their own networking. Every review command died before the review ran.
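A quick way to see whether an environment can hand a nested sandbox its own network at all is to try creating a network namespace directly, with util-linux's unshare. This is my diagnostic, not the fix from the post — if the probe reports blocked, a bubblewrap-based sandbox that wants its own loopback is likely to die the same way:

```shell
check_netns() {
  # Try to run a trivial command inside a fresh network namespace,
  # which is roughly the capability bubblewrap needs before it can
  # bring up loopback inside its sandbox.
  if unshare --net true 2>/dev/null; then
    echo "netns: ok"
  else
    echo "netns: blocked"
  fi
}
check_netns
```

On a bare-metal host or a full VM this usually prints "netns: ok"; inside a container without the right capabilities it prints "netns: blocked".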

I'll write up the fix in its own post. What I took away from the incident was simpler and a little embarrassing: I'd designed a workflow at the level of "two agents, two roles," and watched it collapse because of a kernel capability I hadn't known existed until that afternoon. The protocol was sitting on a stack I hadn't built, and the stack had its own opinions.


This Is Not About AI Purity

I didn't build this because I think LLM code is dangerous and has to be gated. I built it because I was shipping sloppy work and blaming the model for it, when the real problem was that I'd stopped reading diffs myself and there was nobody else reading them either.

Plan the work. Read the files. Have somebody else look at it. Sign off because you actually looked, not because you defaulted to yes.

It's the same discipline I'd want from any engineer on a team. I'd been letting it slide the moment the engineer was made of tokens, which in retrospect is the exact moment I should have been tightening it.

The fix turned out to be two agents, two roles, and a rule that they don't trade hats.