The Loop Needs a Harness

Three voices - Claude Code's Boris Cherny, Addy Osmani, and Thoughtworks' Birgitta Böckeler - point at the same shift: stop prompting AI, start writing loops. But a loop needs a harness. This is why I'm building Shipwright, and why iteration, not the first build, is where it earns its keep.

Date: June 9, 2026

Over the past months I've been building Shipwright, and like anyone building something, I've struggled to put it in one clean sentence. Then I read two articles that gave me clearer language for it than I had myself. Neither one mentions Shipwright. But between them they name two ideas - the loop and the harness - that I think sit close to the centre of letting an AI agent build real software. One is about speed. The other is about being able to trust it.

The first is Addy Osmani's "Loop Engineering," which he published just yesterday, on 8 June 2026. The second is Birgitta Böckeler's "Harness Engineering for Coding Agent Users," on Martin Fowler's site, from back in April.

The loop: when you stop holding the tool

Osmani's argument is simple and, once you see it, hard to unsee. Most of us still use coding agents the way we used autocomplete: we sit there, prompt, read, prompt again. That works, but it doesn't scale, because the bottleneck is us. "You design the system that does it instead," he writes. A loop, in his words, is "a recursive goal where you define a purpose and the AI iterates until complete."
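To make that definition concrete, here is a minimal sketch of "iterate until complete" in TypeScript. Everything in it is an assumption for illustration: the names runLoop, Step and StepResult are hypothetical and do not come from Osmani's article, and the step function is a stand-in for a real agent call. The iteration budget is also my own addition, since an unattended loop needs some way to stop.

```typescript
// Hypothetical sketch: these names are illustrative, not from Osmani's article.
interface StepResult {
  done: boolean; // true once the goal is judged complete
  note: string;  // short, human-readable summary of what the pass did
}

// One pass of work. In a real loop this would call a coding agent.
type Step = (iteration: number) => StepResult;

function runLoop(goal: string, step: Step, maxIterations: number): string[] {
  const log: string[] = [`goal: ${goal}`];
  for (let i = 1; i <= maxIterations; i++) {
    const result = step(i);
    log.push(`iteration ${i}: ${result.note}`);
    if (result.done) {
      log.push("complete");
      return log;
    }
  }
  // The budget is what stops an unattended loop from running forever.
  log.push("stopped: iteration budget exhausted");
  return log;
}

// Stand-in step that reports success on its third pass.
const demoLog = runLoop(
  "make the failing test pass",
  (i) => ({ done: i === 3, note: i === 3 ? "tests green" : "tests still red" }),
  5,
);
```

The point of the shape is that the human writes the goal and the stopping condition once, and the loop, not the human, does the re-prompting.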

He breaks it into five building blocks - automations that trigger work on a schedule, worktrees that isolate parallel branches so agents don't collide, skills that capture project knowledge once and reuse it, plugins and connectors that reach into the real tools, and sub-agents that check the work of other agents. Plus memory, because the model forgets between runs and the repo doesn't.

Osmani is a director at Google and spent close to fourteen years leading developer experience on Chrome. When someone with that vantage point says the unit of work is shifting from the prompt to the system, it's worth slowing down. A fair amount of it lined up with things I'd been fumbling toward, without having had the words for them.

You can hear the same shift from the people building the tools. Boris Cherny, the creator of Claude Code, put it about as plainly as it gets in a recent interview: "I don't prompt Claude anymore. I have loops that are running. They're the ones that are prompting Claude and kind of figuring out what to do. My job is to write loops." He mentioned he'd uninstalled his IDE months earlier, because he simply wasn't opening it any more. When the person who built the coding agent says his job is now writing the loops around it, the shift stops sounding like a prediction.

The part the loop doesn't solve

Here's where the article earns its keep. Osmani doesn't sell the loop as a free lunch. "A loop running unattended is also a loop making mistakes unattended," he writes. The faster it ships, the easier it is to accumulate what he calls comprehension debt - code you own but no longer understand. And the quiet danger is cognitive surrender: using the loop to avoid thinking rather than to think bigger.

His closing line makes it clear: "Build the loop. But build it like someone who intends to stay the engineer, not just the person who presses go."

A loop that makes you faster but dumber is a bad trade. The real question is what you put around the loop so that speed and understanding move together - and that's the gap Böckeler's article fills.

The harness: making the loop trustworthy

Böckeler, a Distinguished Engineer at Thoughtworks, takes a shorthand that has been circulating this year - Agent = Model + Harness - and works out what it means in practice. Everything in an AI agent that isn't the model itself is the harness. The model is non-deterministic, doesn't really know your context, and has, in her words, "no aesthetic disgust at a 300-line function, no intuition that 'we don't do it that way here.'" The harness is how you make its output trustworthy anyway.

She splits the harness into two kinds of control. Guides are feedforward: they steer the agent before it acts - conventions, constraints, system prompts. Sensors are feedback: they observe the result and let the agent self-correct - tests, linters, reviews. And she draws a second line, between computational checks (deterministic, fast, cheap - a type checker runs in milliseconds and is always right) and inferential ones (an AI reviewer, semantically rich but slower and fallible). A good harness uses both, and it keeps quality "left" - the earlier you catch an issue, the cheaper it is to fix.
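One way to picture the two distinctions is as a small data model. The sketch below is my own illustration, not Böckeler's notation: the types Guide, Sensor and Control and the function sensorRunOrder are hypothetical. It also assumes one particular reading of "keeping quality left", namely that the cheap deterministic sensors run before the slower inferential ones.

```typescript
// Hypothetical model; type and function names are assumptions for illustration.
type Nature = "computational" | "inferential"; // deterministic vs model-based

interface Guide {
  kind: "guide"; // feedforward: steers the agent before it acts
  name: string;
}

interface Sensor {
  kind: "sensor"; // feedback: observes the result afterwards
  name: string;
  nature: Nature;
}

type Control = Guide | Sensor;

const controls: Control[] = [
  { kind: "guide", name: "conventions" },
  { kind: "sensor", name: "ai-reviewer", nature: "inferential" },
  { kind: "guide", name: "system prompt" },
  { kind: "sensor", name: "type-checker", nature: "computational" },
  { kind: "sensor", name: "linter", nature: "computational" },
];

// Assumed policy: run fast deterministic checks first, so cheap failures
// surface before any slow, fallible reviewer is asked for an opinion.
function sensorRunOrder(all: Control[]): string[] {
  const sensors = all.filter((c): c is Sensor => c.kind === "sensor");
  const rank = (s: Sensor): number => (s.nature === "computational" ? 0 : 1);
  return sensors
    .slice()
    .sort((a, b) => rank(a) - rank(b))
    .map((s) => s.name);
}

const runOrder = sensorRunOrder(controls);
```

With the sample data, runOrder puts the type checker and linter ahead of the AI reviewer, and the guides do not appear at all because they act before the change exists.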

Put the two articles side by side and the division of work is clear. The loop decides what to do and keeps the work moving. The harness decides whether what comes out is something you can trust. The loop is the engine; the harness is the steering and the brakes. A loop without a harness is fast and dangerous. And the discipline the harness adds would be overhead if you did it by hand - but it runs automatically, on every change, so the part that used to slow teams down just happens.

Where Shipwright fits: the harness

I'll be concrete, because the whole point is that this isn't theory.

Shipwright is the harness half. And the part of it I'd point to first isn't the big, first-time pipeline - it's the changes that come after. The greenfield run that takes a project from requirements to a first deploy happens once. The actual day-to-day is iterate - one change at a time, each small enough to genuinely understand before the next one. That everyday loop is where the harness does most of its work, and it's the part I care about the most.

So here's what a single iterate change passes through, in Böckeler's vocabulary:

Guides (feedforward). Every project carries a constitution - a short list of ALWAYS, ASK FIRST, and NEVER rules. Always run tests before committing. Ask first before a destructive database migration or a production deploy. Never force-push to main. The agent is steered before it acts, not just judged after.
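A constitution of this kind can be sketched as plain data plus a lookup. The following is an illustrative sketch only, not Shipwright's actual file format: the Rule type, the action identifiers and the decide function are all hypothetical names, chosen to mirror the three example rules above.

```typescript
// Hypothetical sketch; not Shipwright's real constitution format.
type Level = "ALWAYS" | "ASK_FIRST" | "NEVER";

interface Rule {
  level: Level;
  action: string; // assumed identifier for something the agent might do
}

type Decision = "proceed" | "ask-human" | "refuse";

const constitution: Rule[] = [
  { level: "ALWAYS", action: "run-tests-before-commit" },
  { level: "ASK_FIRST", action: "destructive-db-migration" },
  { level: "ASK_FIRST", action: "production-deploy" },
  { level: "NEVER", action: "force-push-main" },
];

// Consulted before the agent acts: NEVER wins over ASK_FIRST,
// and anything the constitution does not mention is allowed.
function decide(action: string, rules: Rule[]): Decision {
  const matching = rules.filter((r) => r.action === action);
  if (matching.some((r) => r.level === "NEVER")) return "refuse";
  if (matching.some((r) => r.level === "ASK_FIRST")) return "ask-human";
  return "proceed";
}

// ALWAYS rules are obligations rather than permissions, so they are
// listed separately as steps every change must include.
function obligations(rules: Rule[]): string[] {
  return rules.filter((r) => r.level === "ALWAYS").map((r) => r.action);
}
```

For example, decide("force-push-main", constitution) yields "refuse" and decide("production-deploy", constitution) yields "ask-human", while an unlisted action such as "edit-readme" falls through to "proceed".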

Sensors, the computational kind. Quality gates are hooks that fire on real events and block with a hard exit code. A commit that pushes a file past its size baseline is refused. A secret in source is refused. A destructive migration without a rollback script is refused. As the docs put it, "quality doesn't depend on the agent remembering the rules." That's the difference between a guideline and a guardrail.
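The three refusals above could look roughly like this as a deterministic gate. This is a sketch under stated assumptions, not Shipwright's implementation: StagedFile, runGates, the secret pattern and the ".up.sql"/".down.sql" naming convention for migrations are all hypothetical. The function returns an exit code instead of exiting, so a hook wrapper could pass it to the process.

```typescript
// Hypothetical sketch of a commit gate; names and conventions are assumptions.
interface StagedFile {
  path: string;
  lines: number;
  content: string;
}

interface GateResult {
  exitCode: number; // 0 lets the commit through, non-zero blocks it
  reasons: string[];
}

// Deliberately naive pattern, for illustration only.
const SECRET_PATTERN = /(api[_-]?key|secret|password)\s*[:=]\s*['"][^'"]{8,}['"]/i;
const DESTRUCTIVE_SQL = /\bDROP\s+(TABLE|COLUMN)\b/i;
const UP_MIGRATION = /\.up\.sql$/;

function runGates(files: StagedFile[], sizeBaseline: Record<string, number>): GateResult {
  const reasons: string[] = [];
  const paths = files.map((f) => f.path);
  for (const file of files) {
    // Gate 1: a file may not grow past its recorded size baseline.
    const baseline = sizeBaseline[file.path];
    if (baseline !== undefined && file.lines > baseline) {
      reasons.push(`${file.path}: ${file.lines} lines exceeds baseline ${baseline}`);
    }
    // Gate 2: nothing that looks like a hard-coded secret.
    if (SECRET_PATTERN.test(file.content)) {
      reasons.push(`${file.path}: possible secret in source`);
    }
    // Gate 3: a destructive migration needs a rollback script beside it.
    if (UP_MIGRATION.test(file.path) && DESTRUCTIVE_SQL.test(file.content)) {
      const rollback = file.path.replace(UP_MIGRATION, ".down.sql");
      if (paths.indexOf(rollback) === -1) {
        reasons.push(`${file.path}: destructive migration without ${rollback}`);
      }
    }
  }
  return { exitCode: reasons.length === 0 ? 0 : 1, reasons };
}
```

Because every check here is a plain comparison or pattern match, the same input always produces the same verdict, which is what separates this kind of sensor from a reviewer that has to form an opinion.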

Sensors, the inferential kind. A non-trivial change runs through a review cascade: a spec-reviewer that hard-gates whether the code matches the requirement, then a code-reviewer for quality, then - for risky touches like migrations or async code - a doubt-reviewer that arrives with fresh context and tries to prove the change wrong. Three lenses, because one reviewer, human or AI, has blind spots.
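The control flow of such a cascade is simple even though the reviewers themselves are not. The sketch below shows only the ordering and the hard gate; it is illustrative and not Shipwright's code. The Reviewer functions are stubs standing in for model calls, and Change, Cascade, runCascade and the "migration"/"async" labels are hypothetical names. Whether a failed code review should also stop the cascade is not stated above, so this sketch only hard-gates on the spec review.

```typescript
// Hypothetical sketch of the cascade's ordering; reviewers are stubs.
interface Change {
  id: string;
  touches: string[]; // assumed labels such as "migration" or "async"
}

interface Verdict {
  reviewer: string;
  passed: boolean;
  note: string;
}

type Reviewer = (change: Change) => Verdict;

interface Cascade {
  spec: Reviewer;
  code: Reviewer;
  doubt: Reviewer;
}

const RISKY_TOUCHES = ["migration", "async"];

function runCascade(change: Change, reviewers: Cascade): Verdict[] {
  const verdicts: Verdict[] = [];
  const specVerdict = reviewers.spec(change);
  verdicts.push(specVerdict);
  // Hard gate: code that does not match the requirement goes no further.
  if (!specVerdict.passed) return verdicts;
  verdicts.push(reviewers.code(change));
  // The doubt-reviewer is only called in for risky touches.
  const risky = change.touches.some((t) => RISKY_TOUCHES.indexOf(t) !== -1);
  if (risky) verdicts.push(reviewers.doubt(change));
  return verdicts;
}

// Stub reviewers that always pass; real ones would be separate agent runs.
const stubReviewers: Cascade = {
  spec: (c) => ({ reviewer: "spec-reviewer", passed: true, note: `${c.id} matches its requirement` }),
  code: (c) => ({ reviewer: "code-reviewer", passed: true, note: `${c.id} meets quality bar` }),
  doubt: (c) => ({ reviewer: "doubt-reviewer", passed: true, note: `could not break ${c.id}` }),
};

const riskyRun = runCascade({ id: "CHG-1", touches: ["migration"] }, stubReviewers);
const routineRun = runCascade({ id: "CHG-2", touches: ["docs"] }, stubReviewers);
```

In this sketch riskyRun collects three verdicts and routineRun two, so the extra scrutiny is spent only where a mistake would be expensive.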

And underneath all of it: traceability. Every requirement gets an ID, mapped to the spec, the file that implements it, the test that covers it and the changelog entry. The audit artifacts - traceability, test evidence, change history, a software bill of materials - ship as a byproduct of building, not as a separate project before a release.
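As a rough illustration of what "every requirement gets an ID" buys you, here is a sketch of a trace record and a check for broken links. It assumes a record shape of my own invention: TraceRecord, missingLinks, untraced, the REQ-style IDs and the file paths are all hypothetical and not taken from Shipwright.

```typescript
// Hypothetical record shape; field names, IDs and paths are assumptions.
interface TraceRecord {
  requirementId: string;
  spec?: string;           // where the requirement is specified
  implementation?: string; // file that implements it
  test?: string;           // test that covers it
  changelog?: string;      // changelog entry that announces it
}

const LINKS = ["spec", "implementation", "test", "changelog"] as const;
type Link = typeof LINKS[number];

// Which of the four links is absent for one requirement.
function missingLinks(record: TraceRecord): Link[] {
  return LINKS.filter((link) => !record[link]);
}

// One line per requirement whose trail is incomplete.
function untraced(records: TraceRecord[]): string[] {
  return records
    .filter((r) => missingLinks(r).length > 0)
    .map((r) => `${r.requirementId}: missing ${missingLinks(r).join(", ")}`);
}

const trace: TraceRecord[] = [
  {
    requirementId: "REQ-001",
    spec: "specs/login.md",
    implementation: "src/login.ts",
    test: "test/login.test.ts",
    changelog: "CHANGELOG.md#req-001",
  },
  { requirementId: "REQ-002", spec: "specs/export.md", implementation: "src/export.ts" },
];

const gaps = untraced(trace);
```

With the sample data, gaps contains a single entry for REQ-002, which has a spec and an implementation but no test or changelog entry yet. A check like this is cheap enough to run on every change, which is how the audit trail stays a byproduct rather than a project.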

Here's what I want to be honest about: none of this makes the work faster. Every one of these steps costs time the loop would happily skip. That's the point. The harness is the part that insists each change is done cleanly rather than just quickly - that an iterate is a small, understood, reversible change with a test and a trail behind it, not merely a diff that happened to pass. It's also why I keep returning to Osmani's line about the engineer's mindset. The harness isn't there to remove us; it's there to move our attention to the decisions that actually need a human, and make the mechanical stuff mechanical.

The loop I'm building toward: Leadwright

If Shipwright is the harness, what drives it? For now, mostly me - I still decide what gets worked on next. The piece I'm building to close that gap is Leadwright, a sister project. Where Shipwright governs how a change is made, Leadwright is about what gets made and when: triage, priorities, and "leads" that pull work from a backlog and run it through the harness. That's the loop in Osmani's and Cherny's sense: the system that decides and drives, so the harness has a steady supply of work to run on.

I'll be honest about where it stands. It's early. The foundation is built and tested - around 1,500 lines of carefully scoped TypeScript, 107 of 107 unit tests passing - and the parts that make it run on its own, the scheduler and the agent skill, are still ahead of me. But the detail I like is already true: Leadwright's foundation was built through Shipwright's iterate workflow. Same specs, same test-first build, same review cascade, same trail. The harness building the loop. That's the compounding I was hoping for when I started - each change leaves the next one a little easier to make cleanly.

Staying the engineer

This week I hope to launch Shipwright for early access. The timing of those two articles felt like a gift, because between them they put words to the thing I most wanted to get across.

The loop is what everyone is excited about right now, and for good reason - it's a real step change in how fast you can move. But the half I'd argue matters more is the unglamorous one: the harness, and inside it, the daily discipline of iterating in small, understood steps rather than the one big build. Speed without trust is just risk with better marketing.

The harness is what lets me build something I'd put my name on: it makes explicit the discipline a seasoned engineer would otherwise carry in their head. As Osmani put it, build the loop, but build it like someone who intends to stay the engineer, not just the person who presses go.

Ship right, not just fast.

Sources:

Addy Osmani: "Loop Engineering" - Elevate (Substack) - 8 June 2026 - https://addyo.substack.com/p/loop-engineering
Boris Cherny: Interview on Loops and the future of coding - Acquired (Unplugged, presented by WorkOS), YouTube - 2026 - https://www.youtube.com/watch?v=RkQQ7WEor7w
Birgitta Böckeler: "Harness engineering for coding agent users" - martinfowler.com - 2 April 2026 - https://martinfowler.com/articles/harness-engineering.html
Thoughtworks: "What is harness engineering?" (Technology Podcast) - thoughtworks.com - 2026 - https://www.thoughtworks.com/insights/podcasts/technology-podcasts/what-harness-engineering
Addy Osmani: Biography and publications - addyosmani.com - https://addyosmani.com/