Latest Article

The Maturity Paradox: Harness Engineering Goes Mainstream While the Plumbing Still Leaks

Martin Fowler published a piece on harness engineering yesterday. For those watching this space, that’s the sound of a paradigm crossing the chasm from early adopters into mainstream engineering discourse. When someone of Fowler’s stature writes about your concept, it’s not experimental anymore—it’s something practitioners are expected to understand. The timing is interesting. We’re seeing production benchmarks like AEC-Bench emerge with “full agent harness” as a first-class requirement for evaluation. Cursor shipped version 3 with a complete interface redesign around agent autonomy rather than IDE workflows. Claude Code’s infrastructure is straining under rate limit pressure as users shift from “assistance”...

Previous Articles

Harness Engineering Goes Mainstream

Martin Fowler published a piece on harness engineering today. That’s not just another blog post: when Fowler writes about a pattern, it means the pattern has crossed from experimental to essential. The post by Birgitta...

Six Weeks from Concept to Consensus

Something unusual happened in February 2026. Mitchell Hashimoto mentioned “harness engineering” while building Ghostty, and within weeks, both OpenAI and Anthropic had independently adopted the term. By March, it was everywhere: LangChain...

Environment Design Over Agent Perfection

There’s a conceptual shift happening in agent development that I’ve been watching crystallize over the past week: the move from “perfecting the agent” to “designing the environment.”

The Harness Bottleneck

Something shifted this week in how people talk about agentic coding performance. It wasn’t a new model release or benchmark improvement—it was a realization about where the real constraints live....

The Hidden Tax of Model-Specific Memory

Everyone’s talking about memory as the unlock for coding agents—the gap between 80% and 95% effectiveness isn’t better models, it’s persistent context across sessions. But there’s a Penn State research...

The Plumbing Layer: What Makes Agents Actually Run

While Twitter debates which coding model is best and whether Cursor beats Claude Code, AWS quietly shipped something more important: persistent session storage for agent filesystem state. It’s the kind...

Where Does the Harness End?

There’s a clean mental model floating around Twitter lately: Agent = Model + Harness. AlterPKC broke it down nicely—the model is the intelligence, the harness is everything else (filesystems, sandboxes,...

Vertical Integration as the New Moat

Cursor announced its own frontier model, Composer 2, this week, and the timing feels significant. Not because it’s technically surprising (everyone saw this coming), but because of what...

Memory as Liability: When Remembering Makes Things Worse

The agent memory conversation has taken a fascinating turn this week. While everyone’s been racing to build better memory systems—longer context windows, more sophisticated RAG pipelines, knowledge graphs—some unexpected findings...