LangChain's CEO argues that better models alone won't get your AI agent to production

Published: March 6, 2026 at 05:00 PM EST
4 min read

Source: VentureBeat

Harness engineering

As models get smarter and more capable, the “harnesses” around them must also evolve. This “harness engineering” is an extension of context engineering, says LangChain co‑founder and CEO Harrison Chase in a new VentureBeat Beyond the Pilot podcast episode.

Traditional AI harnesses have tended to constrain models from running in loops and calling tools. Harnesses built specifically for AI agents, however, allow them to interact more independently and perform long‑running tasks more effectively.

Chase also weighed in on OpenAI’s acquisition of OpenClaw, arguing that its viral success came down to a willingness to “let it rip” in ways that no major lab would—and questioning whether the acquisition actually gets OpenAI closer to a safe enterprise version of the product.

“The trend in harnesses is to actually give the large language model (LLM) itself more control over context engineering, letting it decide what it sees and what it doesn’t see. Now, this idea of a long‑running, more autonomous assistant is viable.” – Harrison Chase

Tracking progress and maintaining coherence

While the concept of allowing LLMs to run in a loop and call tools seems relatively simple, it’s difficult to pull off reliably, Chase noted. For a while, models were “below the threshold of usefulness” and simply couldn’t run in a loop, so developers used graphs and wrote chains to get around that.

Chase pointed to AutoGPT—once the fastest‑growing GitHub project ever—as a cautionary example: same architecture as today’s top agents, but the models weren’t good enough yet to run reliably in a loop, so it faded fast.

As LLMs keep improving, teams can construct environments where models can run in loops and plan over longer horizons, continually improving these harnesses. Previously, “you couldn’t really make improvements to the harness because you couldn’t actually run the model in a harness,” Chase said.
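The loop Chase describes can be sketched in a few lines: the harness repeatedly asks the model for an action, executes the named tool, and feeds the result back until the model signals it is done. This is an illustrative skeleton only; `call_model` is a stubbed stand-in for a real LLM client, and the tool set is hypothetical.

```python
def call_model(messages):
    # Stubbed model: requests the time once, then finishes.
    # A real harness would call an LLM API here.
    if not any(m["role"] == "tool" for m in messages):
        return {"tool": "get_time", "args": {}}
    return {"final": "It is 12:00."}

# Hypothetical tool registry the harness exposes to the model.
TOOLS = {"get_time": lambda: "12:00"}

def run_agent(user_input, max_steps=10):
    """Run the model in a loop: act, observe, repeat until done."""
    messages = [{"role": "user", "content": user_input}]
    for _ in range(max_steps):
        action = call_model(messages)
        if "final" in action:          # model decided it is finished
            return action["final"]
        result = TOOLS[action["tool"]](**action["args"])
        messages.append({"role": "tool", "content": result})
    raise RuntimeError("agent exceeded step budget")
```

The `max_steps` budget is the kind of guardrail traditional harnesses used to constrain looping; agent-native harnesses raise or remove it as models become reliable enough to run longer.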

Deep Agents

LangChain’s answer to this challenge is Deep Agents, a customizable general‑purpose harness built on LangChain and LangGraph. Its features include:

  • Planning capabilities
  • A virtual filesystem
  • Context and token management
  • Code execution
  • Skills and memory functions

Deep Agents can delegate tasks to subagents that are specialized with different tools and configurations and can work in parallel. Context is isolated, meaning subagent work doesn’t clutter the main agent’s context, and large subtask context is compressed into a single result for token efficiency.
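The isolation-and-compression pattern can be sketched as follows: the subagent works in its own message history, and only a single compressed summary is appended to the parent's context. The `compress` step here is a placeholder (a real harness would typically use the model itself to summarize); function names are illustrative, not the Deep Agents API.

```python
def run_subagent(task):
    # Subagent works in its OWN context; these intermediate steps
    # never touch the parent's message history.
    transcript = [f"task: {task}"]
    for step in ("search", "read", "draft"):  # pretend work
        transcript.append(f"{step} output for {task}")
    return transcript  # full, verbose transcript

def compress(transcript):
    # Placeholder for model-driven summarization: collapse the
    # subagent's whole transcript into one result message.
    return f"summary({len(transcript)} steps): {transcript[-1]}"

def delegate(parent_context, task):
    """Run a subtask in isolation; return only a compact result."""
    result = compress(run_subagent(task))
    parent_context.append(result)  # exactly one message added
    return parent_context
```

However many steps the subagent takes, the parent pays for one message, which is the token-efficiency point above.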

All agents have access to file systems and can essentially create to‑do lists that they execute and track over time.

“When it goes on to the next step, and it goes on to step two or step three or step four out of a 200‑step process, it has a way to track its progress and keep that coherence. It comes down to letting the LLM write its thoughts down as it goes along, essentially.” – Chase
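The write-it-down mechanism can be sketched as a to-do file in a virtual filesystem: the agent plans once, checks steps off as it goes, and can re-read the file at any point to recover where it is in a long process. This is an illustrative sketch, not the actual Deep Agents filesystem tools.

```python
class VirtualFS:
    """In-memory filesystem the agent can read and write between loop turns."""
    def __init__(self):
        self.files = {}
    def write(self, path, text):
        self.files[path] = text
    def read(self, path):
        return self.files.get(path, "")

def plan(fs, steps):
    # Write the full plan up front as an unchecked to-do list.
    fs.write("todo.md", "\n".join(f"[ ] {s}" for s in steps))

def complete_step(fs, step):
    # Check off one step; the file persists across iterations.
    fs.write("todo.md", fs.read("todo.md").replace(f"[ ] {step}", f"[x] {step}", 1))

def next_step(fs):
    # Re-reading the list tells the agent exactly where it is.
    for line in fs.read("todo.md").splitlines():
        if line.startswith("[ ] "):
            return line[4:]
    return None  # plan complete
```

Because progress lives in the file rather than in the ever-growing message history, the agent stays coherent even at step 180 of 200.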

Chase emphasized that harnesses should be designed so that models can maintain coherence over longer tasks, and should be "amenable" to letting the model decide when compacting context is "advantageous."

Giving agents access to code interpreters and Bash tools increases flexibility. Providing agents with skills, rather than loading every tool up front, lets them pull in information only when it's needed:

“So rather than hard‑code everything into one big system prompt, you could have a smaller system prompt: ‘This is the core foundation, but if I need to do X, let me read the skill for X. If I need to do Y, let me read the skill for Y.’” – Chase
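The pattern Chase describes can be sketched as lazy skill loading: the core prompt stays small and advertises only skill names, and a skill's full instructions are expanded into context only when the task calls for them. The skill names and contents here are hypothetical.

```python
# Full skill instructions live outside the system prompt (hypothetical
# contents; in practice these might be files on the agent's filesystem).
SKILLS = {
    "write_sql":  "Skill: translate a question into a SQL query ...",
    "make_chart": "Skill: render tabular data as a chart ...",
}

# The core prompt only NAMES the skills, keeping it small.
CORE_PROMPT = "You are a data assistant. Available skills: " + ", ".join(SKILLS)

def build_prompt(needed_skills):
    """Expand only the skills this task needs into the context."""
    loaded = [SKILLS[name] for name in needed_skills if name in SKILLS]
    return "\n\n".join([CORE_PROMPT, *loaded])
```

A task that needs SQL pays the token cost of `write_sql` alone; the charting skill stays unloaded until something asks for it.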

Essentially, context engineering is a “really fancy” way of asking: What is the LLM seeing? Because that’s different from what developers see. When human developers can analyze agent traces, they can put themselves in the AI’s “mindset” and answer questions such as:

  • What is the system prompt?
  • How is it created?
  • Is it static or populated dynamically?
  • What tools does the agent have?
  • When it makes a tool call and gets a response back, how is that presented?

“When agents mess up, they mess up because they don’t have the right context; when they succeed, they succeed because they have the right context. I think of context engineering as bringing the right information in the right format to the LLM at the right time.” – Chase

Podcast highlights

  • How LangChain built its stack: LangGraph as the core pillar, LangChain at the center, Deep Agents on top.
  • Why code sandboxes will be the next big thing.
  • How a different type of UX will evolve as agents run at longer intervals (or continuously).
  • Why traces and observability are core to building an agent that actually works.

You can listen and subscribe to Beyond the Pilot on Spotify, Apple Podcasts, or wherever you get your podcasts.
