Visual Debugging for AI Agents (ANY Framework)
Source: Dev.to
TL;DR
We built LangGraph Studio’s visual debugging experience, but made it work with every AI agent framework. Open source. Local‑first. Try it now.
Traditional debugging tools don’t work for AI agents
- Breakpoints → Agents are async, non‑deterministic
- Print statements → Good luck finding the relevant logs
- Stack traces → Don't show LLM calls or agent decisions
- Unit tests → Hard to test non‑deterministic behavior
What developers told us (from talking to 50+ production teams)
“LangGraph is S‑tier specifically because of visual debugging. But we’re stuck—we can’t switch frameworks without losing the debugger.”
The data
- 94% of production deployments need observability
- LangGraph rated S‑tier for visual execution traces
- All existing solutions are framework‑locked
The landscape
| Solution | Framework support |
|---|---|
| LangGraph Studio | LangGraph only |
| LangSmith | LangChain‑focused |
| Crew Analytics | CrewAI only |
| AutoGen | No visual debugger |
Developers are choosing frameworks based on tooling, not capabilities. That’s backwards.
Introducing OpenClaw Observability Toolkit
Universal visual debugging for AI agents.
Integrations
LangChain
```python
from openclaw_observability.integrations import LangChainCallbackHandler

chain.run(input="query", callbacks=[LangChainCallbackHandler()])
```
Raw Python (works today)
```python
from openclaw_observability import observe

@observe()
def my_agent_function(input):
    return process(input)
```
CrewAI, AutoGen (coming soon)
One tool. All frameworks.
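Under the hood, decorator-based tracing like `@observe()` typically wraps the function, records its inputs, output, timing, and any exception, and hands that record to a tracer. Here is a minimal, purely illustrative sketch of that pattern; it is not the toolkit's actual implementation, and `SPANS` is a stand-in for wherever the real tracer sends data:

```python
import functools
import time
import uuid

# Illustrative in-memory span store; the real toolkit would
# forward spans to its local tracer/UI instead.
SPANS = []

def observe(span_type="function"):
    """Hypothetical sketch of a tracing decorator."""
    def decorator(fn):
        @functools.wraps(fn)
        def wrapper(*args, **kwargs):
            span = {
                "id": uuid.uuid4().hex,
                "name": fn.__name__,
                "type": span_type,
                "inputs": {"args": args, "kwargs": kwargs},
                "start": time.time(),
            }
            try:
                result = fn(*args, **kwargs)
                span["output"] = result
                span["status"] = "success"
                return result
            except Exception as exc:
                span["status"] = "error"
                span["error"] = repr(exc)
                raise
            finally:
                # Runs on both success and failure, so timing is always recorded.
                span["duration_ms"] = (time.time() - span["start"]) * 1000
                SPANS.append(span)
        return wrapper
    return decorator
```

Because the wrapper records in a `finally` block, failed calls still produce a span with timing and error details, which is what makes failures debuggable.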
Interactive execution graph
```text
┌─────────────────────────────────────┐
│ Customer Service Agent              │
├─────────────────────────────────────┤
│ [User Query: "Why was I charged?"]  │
│              ↓                      │
│      ┌─────────────┐                │
│      │  Classify   │ 🟢 250ms      │ ← Click to inspect
│      │  Intent     │                │
│      └─────────────┘                │
│              ↓                      │
│      ┌─────────────┐                │
│      │  Check      │ 🔴 FAILED     │ ← See error details
│      │  Database   │                │
│      └─────────────┘                │
└─────────────────────────────────────┘
```
Click any node to see:
- Inputs & outputs – what went in, what came out
- LLM calls – full prompts, responses, tokens, cost
- Timing – duration of each step
- Errors – full stack traces with context
Track what matters
- Cost per agent
- Latency per step
- Success rates
- Quality metrics
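Cost per agent, for instance, can be rolled up from the token counts recorded on each LLM-call span. A sketch, assuming made-up placeholder prices and an illustrative span shape:

```python
# Placeholder per-1K-token prices; real prices depend on the model used.
PRICE_PER_1K = {"input": 0.003, "output": 0.015}

def cost_per_agent(spans):
    """Sum estimated LLM cost per agent from recorded token counts."""
    totals = {}
    for s in spans:
        cost = (s["input_tokens"] / 1000) * PRICE_PER_1K["input"] \
             + (s["output_tokens"] / 1000) * PRICE_PER_1K["output"]
        totals[s["agent"]] = totals.get(s["agent"], 0.0) + cost
    return totals
```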
Example: debugging a failing customer‑service query
Without observability
```text
ERROR: Query failed
```
(Good luck figuring out which agent, which step, and why)
With OpenClaw Observability
```text
Trace: customer_query_abc123
├─ Router Agent → Success (200ms)
│   └─ Intent: "billing_issue"
├─ Billing Agent → FAILED (350ms)
│   └─ Database lookup timeout
└─ Support Agent → Not reached
```
Click “Billing Agent” → see full error:
```text
DatabaseTimeout: Connection timeout after 30s
  at check_subscription_status()
  Input: {"user_id": "12345"}
  Database: prod-billing-db (response time: 45s)
```
Root cause: Billing database is slow. Scale it up.
Time to debug: 30 seconds (instead of 3 hours).
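An indented trace tree like the one above can be rendered from flat spans by following parent links. A minimal sketch; the span fields (`name`, `parent`, `status`, `ms`) are illustrative, not the toolkit's schema:

```python
def render_trace(spans, parent=None, depth=0):
    """Recursively render spans whose 'parent' matches, indenting children."""
    lines = []
    for s in (x for x in spans if x.get("parent") == parent):
        lines.append("  " * depth + f"{s['name']} -> {s['status']} ({s['ms']}ms)")
        lines.extend(render_trace(spans, s["name"], depth + 1))
    return lines
```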
Installation
```bash
pip install openclaw-observability
```
```python
from openclaw_observability import observe, init_tracer
from openclaw_observability.span import SpanType

tracer = init_tracer(agent_id="my-agent")

@observe(span_type=SpanType.AGENT_DECISION)
def choose_action(state):
    action = llm.predict(state)
    return action

@observe(span_type=SpanType.TOOL_CALL)
def fetch_data(query):
    return database.query(query)

result = choose_action(current_state)
```
Run the UI:
```bash
python -m openclaw_observability.server
# Open http://localhost:5000
```
Performance & deployment
Contribute
- Framework integrations (CrewAI, AutoGen, custom frameworks)
- UI improvements (filtering, search, real‑time updates)
- Production features (monitoring, alerts, metrics)
GitHub:
Documentation: Quick Start Guide
Examples: examples/ directory
Discord: Join our community
Built with ❤️ by AI agents at Reflectt.