The Model Is Only 10%: The Real Lesson of the New SDLC

TL;DR

A new Google whitepaper, “The New SDLC With Vibe Coding,” argues that the AI model accounts for only about 10% of an agent’s behavior. The real challenge and value, it says, lie in the harness around the model: prompts, tools, verification, and context management. That marks a shift in AI development priorities away from model selection.

The whitepaper, authored by Addy Osmani, Shubham Saboo, and Sokratis Kartakis, reports that 85% of professional developers use AI coding agents regularly (51% daily) and that 41% of new code is AI-generated. It stresses, however, that the model is only a small part of the system: the harness (prompts, tools, rules, and observability) accounts for approximately 90% of behavior.

The paper’s clearest example is an agent that moved from outside the top 30 to the top 5 on Terminal Bench 2.0 by changing only the harness, with the same underlying model. The authors argue that most failures trace back to configuration (a missing tool, a vague rule, a noisy context) rather than to the model’s capabilities. That shifts the strategic focus toward building durable, configurable scaffolding rather than chasing the latest model release.

At a glance

When: published May 2026

The development: Google’s new whitepaper argues that the core of an effective AI system is not the model but the harness and verification processes around it, shifting the focus of AI development.

The Model Is Only 10% — The New SDLC With Vibe Coding

AI Dispatch · Field Notes

Google · Osmani, Saboo & Kartakis · May 2026

The model is only 10%

A Google whitepaper argues software’s biggest shift is from writing code to expressing intent. Its sharpest claim: the model you obsess over is the smallest part of the system — the scaffolding around it does the real work.

A spectrum, not a binary: the differentiator is how outputs get verified

Vibe Coding: casual prompts · “does it seem to work?” · disposable code · high risk

Structured AI-Assisted: detailed prompts + constraints · manual testing · features in real codebases

Agentic Engineering: formal specs · automated tests + evals + CI gates · production scale · low risk

Tests verify the deterministic; evals verify the rest. Without both, it’s vibe coding — however clever the prompt.
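The split between tests and evals can be sketched in a few lines of Python. This is a minimal illustration, not the whitepaper’s method: `slugify`, `EvalCase`, `run_eval`, `ci_gate`, the keyword rubric, and the 0.8 threshold are all hypothetical names and values chosen for the example, and `stub_agent` stands in for a real model call.

```python
from dataclasses import dataclass
from typing import Callable


def slugify(title: str) -> str:
    """Deterministic helper: the kind of code a unit test can pin down exactly."""
    return "-".join(title.lower().split())


def test_slugify() -> bool:
    # A test: one input, one exact expected output.
    return slugify("The Model Is Only 10%") == "the-model-is-only-10%"


@dataclass
class EvalCase:
    prompt: str
    must_mention: list[str]  # hypothetical rubric: terms the answer has to contain


def run_eval(agent: Callable[[str], str], cases: list[EvalCase]) -> float:
    """An eval: score free-form output against a rubric and return the pass rate."""
    passed = 0
    for case in cases:
        answer = agent(case.prompt).lower()
        if all(term in answer for term in case.must_mention):
            passed += 1
    return passed / len(cases)


def ci_gate(agent: Callable[[str], str], cases: list[EvalCase],
            threshold: float = 0.8) -> bool:
    """Merge only if the deterministic tests pass AND the eval pass rate clears the bar."""
    return test_slugify() and run_eval(agent, cases) >= threshold


def stub_agent(prompt: str) -> str:
    # Stand-in for a real model call (assumption for this sketch).
    return "Use the harness: prompts, tools, and observability."


CASES = [
    EvalCase("What shapes agent behavior?", ["harness"]),
    EvalCase("What does the harness include?", ["tools"]),
    EvalCase("Where should code run?", ["sandbox"]),
]

if __name__ == "__main__":
    print("tests pass:", test_slugify())
    print("eval pass rate:", run_eval(stub_agent, CASES))
    print("gate open:", ci_gate(stub_agent, CASES))
```

The point of the design is that the gate needs both checks: the test catches exact regressions, and the eval catches drift in output that has no single correct string.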

The idea worth building your strategy around

Agent = Model + Harness

MODEL: ~10%

HARNESS: ~90% (prompts · tools · context · hooks · sandboxes · observability). It is your surface area, not the provider’s.

Outside Top 30 → Top 5 on Terminal Bench 2.0 by changing only the harness — same model.

“Most agent failures, examined honestly, are configuration failures” — a missing tool, a vague rule, a noisy context.
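To make “Agent = Model + Harness” concrete, here is a rough Python sketch in which the same stub model succeeds or fails depending only on how the harness is configured. Everything in it is an assumption for illustration: the `Harness` class, its fields, the `CALL name: argument` tool convention, and `stub_model` are hypothetical and do not come from the whitepaper or any particular framework.

```python
from dataclasses import dataclass, field
from typing import Callable

Model = Callable[[str], str]  # hypothetical: any text-in, text-out model call


@dataclass
class Harness:
    system_prompt: str
    tools: dict[str, Callable[[str], str]] = field(default_factory=dict)
    max_context_chars: int = 2000
    log: list[str] = field(default_factory=list)  # minimal observability

    def build_prompt(self, task: str, context: str) -> str:
        # Context management: keep only the most recent characters.
        trimmed = context[-self.max_context_chars:]
        tool_list = ", ".join(sorted(self.tools)) or "none"
        return (f"{self.system_prompt}\nTools: {tool_list}\n"
                f"Context: {trimmed}\nTask: {task}")

    def run(self, model: Model, task: str, context: str = "") -> str:
        prompt = self.build_prompt(task, context)
        self.log.append(f"prompt_chars={len(prompt)}")
        reply = model(prompt)
        # Hypothetical convention: the model asks for a tool as "CALL name: argument".
        if reply.startswith("CALL "):
            name, _, arg = reply[5:].partition(":")
            name = name.strip()
            tool = self.tools.get(name)
            if tool is None:
                # A configuration failure, not a model failure.
                self.log.append(f"missing_tool={name}")
                return f"error: tool '{name}' is not configured"
            self.log.append(f"tool={name}")
            return tool(arg.strip())
        return reply


def stub_model(prompt: str) -> str:
    # Stand-in for a real model; it always asks to run the tests.
    return "CALL run_tests: unit"


if __name__ == "__main__":
    bare = Harness("You are a coding agent.")
    full = Harness("You are a coding agent.",
                   tools={"run_tests": lambda suite: f"{suite}: 12 passed"})
    print(bare.run(stub_model, "fix the failing build"))
    print(full.run(stub_model, "fix the failing build"))
```

The model is identical in both runs. Only the tool configuration differs, which is the kind of gap the quote above describes.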

The economics: it’s a token-cost problem (CapEx vs OpEx)

Vibe Coding (low CapEx · high OpEx): Looks free, hides debt: token burn (fix-it loops), maintenance tax (AI spaghetti), security remediation. Crosses over to 3–10× more per feature.

Agentic Engineering (high CapEx · low OpEx): Pay upfront (specs, evals, context), then ship cheaply. Levers: context engineering for first-pass success, plus intelligent model routing (cheap models for the easy work).
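The model-routing lever can be sketched as a small rule-based router. This is one possible shape under stated assumptions, not how the paper or any vendor implements it: the tier names, the per-token prices, the keyword list, and the `files_touched` heuristic are all made up for illustration.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class ModelTier:
    name: str
    usd_per_1k_tokens: float


# Hypothetical tiers and prices, chosen only to show the shape of the trade-off.
CHEAP = ModelTier("small-model", 0.001)
STRONG = ModelTier("large-model", 0.030)

# Hypothetical signal that a task needs the stronger model.
HARD_KEYWORDS = ("refactor", "architecture", "migrate", "security")


def route(task: str, files_touched: int) -> ModelTier:
    """Send broad or risky work to the strong tier, everything else to the cheap one."""
    text = task.lower()
    if files_touched > 3 or any(keyword in text for keyword in HARD_KEYWORDS):
        return STRONG
    return CHEAP


def estimated_cost(tasks: list[tuple[str, int]], tokens_per_task: int = 2000) -> float:
    """Total cost in USD if every task is routed, assuming a flat token budget per task."""
    return sum(route(task, n).usd_per_1k_tokens * tokens_per_task / 1000
               for task, n in tasks)


TASKS = [
    ("rename variable", 1),
    ("fix typo in README", 1),
    ("migrate auth module", 6),
]

if __name__ == "__main__":
    routed = estimated_cost(TASKS)
    all_strong = len(TASKS) * STRONG.usd_per_1k_tokens * 2000 / 1000
    print(f"routed: ${routed:.3f}  all-strong: ${all_strong:.3f}")
```

A production router would more likely rely on eval results or a classifier than on keywords, but the cost logic is the same: pay for the strong model only where the task calls for it.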

85% of devs use AI coding agents (51% daily)

41% of all new code is AI-generated

~90% of agent behavior is the harness, not the model

+19% longer on some tasks (METR): verification is the cost

The read

The clearest map yet of how serious AI development works — and mostly tool-agnostic. But it’s a Google funnel: the concepts are neutral, the on-ramps point to Gemini, Jules & the ADK. If the harness is 90% and it’s yours, your moat and your costs both live there — so own your scaffolding, route across models, and remember: AI amplifies whatever engineering culture it lands in.

Source: Osmani, Saboo & Kartakis, “The New SDLC With Vibe Coding,” Google (May 2026). Figures are the paper’s own, incl. METR & LangChain. Analysis is the author’s.


Why Harness Engineering Outweighs Model Selection

This shift in perspective matters because it suggests that long-term competitive advantage in AI systems depends more on how organizations design, configure, and verify their AI environments than on acquiring the newest models. It also implies that cost management—including token economy, maintenance, and security—becomes central to AI strategy, as the harness is where most of the value and control resides.

For organizations, this means investing in robust scaffolding, context engineering, and verification processes can lead to more reliable, secure, and cost-effective AI deployments, rather than focusing solely on model upgrades.
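The cost argument can be made concrete with a simple break-even calculation. The sketch below treats each approach as an upfront cost plus a per-feature cost and finds the feature count at which the engineered approach becomes cheaper. All figures are hypothetical. The only link to the paper is the 3× per-feature multiplier, taken from the low end of its reported 3–10× range.

```python
from typing import Optional


def cumulative_cost(upfront: float, per_feature: float, features: int) -> float:
    """Total spend after shipping a given number of features."""
    return upfront + per_feature * features


def break_even_features(vibe_upfront: float, vibe_per_feature: float,
                        eng_upfront: float, eng_per_feature: float) -> Optional[int]:
    """Smallest feature count at which the engineered approach is strictly cheaper.

    Returns None if the engineered approach never catches up.
    """
    if eng_per_feature >= vibe_per_feature:
        return None
    n = 0
    while (cumulative_cost(eng_upfront, eng_per_feature, n)
           >= cumulative_cost(vibe_upfront, vibe_per_feature, n)):
        n += 1
    return n


if __name__ == "__main__":
    # Hypothetical numbers: no upfront spend but 3x the per-feature cost,
    # versus a 5,000 upfront investment in specs, evals, and context.
    n = break_even_features(vibe_upfront=0, vibe_per_feature=300,
                            eng_upfront=5000, eng_per_feature=100)
    print("engineered approach is cheaper from feature", n)
```

With a higher multiplier the crossover arrives sooner, which is why the size of the per-feature gap matters more than the size of the upfront bill.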


Evolution of AI Development Focus

Prior to 2026, AI development heavily emphasized acquiring and fine-tuning large models, with success often measured by model size and benchmark scores. The whitepaper reflects a maturing understanding that model quality is only part of the equation. Its cited result, a coding agent moving from outside the top 30 to the top 5 on Terminal Bench 2.0 through harness changes alone, suggests that configuration and context management can matter more than a model upgrade.

This perspective aligns with broader industry shifts toward agentic engineering, where structured scaffolding, verification, and context loading form the core of effective AI systems.

“The model is something like 10% of what determines behavior; the harness is the other 90%.”
— Addy Osmani


What Aspects of the Harness Remain Unclear

While the whitepaper underscores the importance of harness and context engineering, it does not specify best practices for building these systems at scale or how organizations should prioritize investments. The precise impact of different harness components across varied use cases remains to be fully validated in practice.


Next Steps for AI System Optimization

Organizations are likely to focus on developing and refining their harness architectures, including better tools for context management, verification, and observability. Further research and case studies will clarify which configurations yield the best ROI and how to standardize best practices in harness design. Additionally, expect industry shifts toward training teams in context engineering and integrating these skills into AI development workflows.


Key Questions

Why is the model only 10% of the system’s behavior?

The whitepaper explains that most of the AI system’s behavior depends on how the model is integrated, configured, and verified. The harness—including prompts, tools, and rules—shapes the output far more than the raw model itself.

How can organizations improve their AI systems based on this insight?

Focusing on building robust harnesses, managing context effectively, and establishing verification processes can lead to more reliable and cost-efficient AI deployments, rather than solely upgrading models.

Does this mean model development is less important?

Not necessarily less important, but the whitepaper suggests that model improvements alone are insufficient. System design, configuration, and verification are equally, if not more, critical for achieving desired outcomes.

What are the risks of focusing too much on harness engineering?

Over-investing in scaffolding without understanding the core AI capabilities may lead to complexity and maintenance challenges. Balancing model development with system engineering remains essential.

Will this shift change AI development practices industry-wide?

It is likely, as organizations recognize that effective AI systems depend heavily on engineering discipline, leading to new roles and skillsets focused on system configuration and verification.

Source: ThorstenMeyerAI.com



Author

DreamRidiculous Team
