Agentic AI: Schema-Validated Tool Execution and Deterministic Caching
Source: Dev.to
Overview
Agentic AI systems do not fail because models cannot reason. They fail because tool execution is unmanaged.
Once agents are allowed to plan, retry, self‑criticize, or collaborate, tool calls multiply rapidly. Without strict controls, this leads to infrastructure failures, unpredictable cost growth, and non‑deterministic behavior.
This article explains how to engineer the tool‑execution layer of an agentic AI system using two explicit and independent mechanisms:
- Contract‑driven tool execution
- Deterministic tool‑result caching
Each mechanism solves a different class of production failures and must be implemented separately.
Real Production Scenario
Context
You are building an Incident Analysis Agent for SRE teams.
What the agent does
- Fetch logs for a service
- Analyze error patterns
- Re‑fetch logs if confidence is low
- Allow a second agent (critic) to validate findings
Tool characteristics
Tool name: fetch_service_logs
Backend: Elasticsearch / Loki / Splunk
Latency: 300–800 ms
- Rate‑limited
- Expensive per execution
This is a common real‑world agent workload.
Part I: Contract‑Driven Tool Execution in Agentic AI Systems
The problem without contracts
When LLMs emit tool arguments directly, the runtime receives inputs like:
{"service": "auth", "window": "24 hours"}
{"service": "Auth Service", "window": "yesterday"}
{"service": ["auth"], "window": 24}
{"service": "", "window": "24h"}
Why this happens
- LLMs reason in natural language
- LLMs paraphrase arguments
- LLMs are not type‑safe systems
What breaks in production
- Invalid Elasticsearch queries
- Full index scans
- Query‑builder crashes
- Silent data corruption
- Retry loops amplify failures
Relying on the model to always produce valid input is not system design.
What contract‑driven tool execution means
Contract‑driven execution means:
- The runtime owns the tool interface
- The model must conform to that interface
- Invalid input never reaches infrastructure
This is the same boundary enforcement used in production APIs.
Step 1: Define a strict tool contract
```python
from pydantic import BaseModel, Field, field_validator
import re
from typing import List


class FetchServiceLogsInput(BaseModel):
    service: str = Field(
        ...,
        description="Kubernetes service name, lowercase, no spaces"
    )
    window: str = Field(
        ...,
        description="Time window format: 5m, 1h, 24h"
    )

    @field_validator("service")
    @classmethod
    def validate_service(cls, value: str) -> str:
        if not value:
            raise ValueError("service cannot be empty")
        if not re.fullmatch(r"[a-z0-9\-]+", value):
            raise ValueError("service must be lowercase alphanumeric with dashes")
        return value

    @field_validator("window")
    @classmethod
    def validate_window(cls, value: str) -> str:
        if not re.fullmatch(r"\d+(m|h)", value):
            raise ValueError("window must be like 5m, 1h, 24h")
        return value


class FetchServiceLogsOutput(BaseModel):
    logs: List[str]
```
What these validations prevent
| Invalid input | Prevented issue |
|---|---|
| Empty service | Full log scan |
| Mixed case or spaces | Query mismatch |
| Natural‑language time | Ambiguous queries |
| Lists or numbers | Query‑builder crashes |
Nothing reaches infrastructure unless it passes this gate.
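A quick demonstration of the gate, using the malformed inputs from earlier:
```python
from pydantic import ValidationError

# Passes the contract
FetchServiceLogsInput(service="auth-service", window="24h")

# Rejected before any query builder runs: bad casing, natural-language window
try:
    FetchServiceLogsInput(service="Auth Service", window="yesterday")
except ValidationError as exc:
    print(exc.error_count(), "validation errors")  # → 2 validation errors
```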
Step 2: Implement the actual tool
```python
def fetch_service_logs(service: str, window: str) -> list[str]:
    # Stub standing in for the real Elasticsearch / Loki / Splunk query
    print(f"QUERY logs for service={service}, window={window}")
    return [
        f"[ERROR] timeout detected in {service}",
        f"[WARN] retry triggered in {service}",
    ]
```
Step 3: Runtime‑owned tool registry
```python
TOOLS = {
    "fetch_service_logs": {
        "version": "v1",
        "input_model": FetchServiceLogsInput,
        "output_model": FetchServiceLogsOutput,
        "handler": fetch_service_logs,
        "cache_ttl": 3600,  # seconds
    }
}
```
The agent cannot invent tools, bypass schemas, or change versions.
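How the model learns the interface is provider-specific, but one common pattern (sketched here, not prescribed by the registry above) is to expose only the generated JSON Schema and keep handlers server-side:
```python
def tool_specs() -> list[dict]:
    """Hypothetical helper: build the tool descriptions the LLM sees.

    Only names, versions, and JSON Schemas cross the boundary;
    handlers never leave the runtime.
    """
    return [
        {
            "name": name,
            "version": spec["version"],
            "parameters": spec["input_model"].model_json_schema(),
        }
        for name, spec in TOOLS.items()
    ]
```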
Step 4: Contract‑driven execution boundary
```python
def execute_tool_contract(tool_name: str, raw_args: dict):
    tool = TOOLS[tool_name]

    # Validate input against the contract; raises ValidationError on bad input
    args = tool["input_model"](**raw_args)

    # Call the handler with a clean dict
    raw_result = tool["handler"](**args.model_dump())

    # Wrap the result in the output model (the field name is specific to this tool)
    return tool["output_model"](logs=raw_result)
```
Execution flow for contract enforcement
```
Agent emits tool call
          ↓
Raw arguments (untrusted)
          ↓
Schema validation
   ┌───────────────┐
   │    Invalid    │ → reject and re‑plan
   └───────────────┘
          ↓
        Valid
          ↓
Tool executes
          ↓
Infrastructure queried safely
```
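The reject branch deserves code of its own. A minimal sketch (the wrapper name and error shape here are illustrative, not from the registry above):
```python
from pydantic import ValidationError

def execute_or_replan(tool_name: str, raw_args: dict) -> dict:
    """Illustrative wrapper: turn contract violations into planner feedback."""
    try:
        result = execute_tool_contract(tool_name, raw_args)
        return {"ok": True, "result": result}
    except ValidationError as exc:
        # Structured errors go back to the agent so it can correct
        # its arguments; infrastructure is never touched.
        return {"ok": False, "errors": exc.errors()}
```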
Part II: Deterministic Caching in Agentic AI Systems
The problem after contracts are added
Even with perfect validation, agents repeat work:
```python
execute_tool_contract(
    "fetch_service_logs",
    {"service": "auth-service", "window": "24h"}
)

execute_tool_contract(
    "fetch_service_logs",
    {"window": "24h", "service": "auth-service"}
)
```
Same intent, same backend, executed twice.
Why naive caching fails
{"service": "auth-service", "window": "24h"}
{"window": "24h", "service": "auth-service"}
Different strings → different cache keys, even though they are semantically identical.
Agentic systems require semantic equivalence, not raw string equality.
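Canonicalisation makes this mechanical; a two-line check shows that sorting keys collapses both orderings into one string:
```python
import json

a = {"service": "auth-service", "window": "24h"}
b = {"window": "24h", "service": "auth-service"}

# Key order disappears under canonical serialisation
assert json.dumps(a, sort_keys=True) == json.dumps(b, sort_keys=True)
```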
Infrastructure required for deterministic caching
- Canonicalisation – Convert incoming arguments to a deterministic, ordered representation (e.g., sorted JSON).
- Hash‑based cache key – Compute a stable hash (SHA‑256) of the canonicalised payload together with the tool version.
- Result storage – Persist the output model (or a serialized form) together with the hash and a TTL.
- Cache lookup wrapper – Before invoking the handler, check the cache; on a hit, return the stored result; on a miss, execute and store.
Step 1: Minimal in‑memory implementation
A minimal sketch of all four pieces, using an in‑process dict as the cache:
```python
import json, hashlib, time
from collections import defaultdict

# Simple in‑memory cache for illustration
_CACHE = defaultdict(dict)  # {tool_name: {hash: (timestamp, result)}}


def _canonicalise(args: dict) -> str:
    """Return a deterministic JSON string with sorted keys."""
    return json.dumps(args, sort_keys=True, separators=(",", ":"))


def _hash_payload(tool_name: str, payload: str) -> str:
    return hashlib.sha256(f"{tool_name}:{payload}".encode()).hexdigest()


def execute_with_cache(tool_name: str, raw_args: dict):
    tool = TOOLS[tool_name]

    # 1️⃣ Validate input
    args = tool["input_model"](**raw_args)

    # 2️⃣ Canonicalise & hash
    payload = _canonicalise(args.model_dump())
    key = _hash_payload(tool_name, payload)

    # 3️⃣ Cache lookup: honour the tool's TTL, fall through if stale
    entry = _CACHE[tool_name].get(key)
    if entry:
        ts, cached_result = entry
        if time.time() - ts < tool["cache_ttl"]:
            return cached_result

    # 4️⃣ Execute and store
    raw_result = tool["handler"](**args.model_dump())
    validated = tool["output_model"](logs=raw_result)
    _CACHE[tool_name][key] = (time.time(), validated)
    return validated
```
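Under this sketch, argument order no longer matters; the reordered call from earlier resolves to the same key and never touches the backend:
```python
# First call misses and queries the backend once
r1 = execute_with_cache("fetch_service_logs",
                        {"service": "auth-service", "window": "24h"})

# Reordered arguments canonicalise to the same key: cache hit
r2 = execute_with_cache("fetch_service_logs",
                        {"window": "24h", "service": "auth-service"})

assert r1 is r2  # same stored object, one backend execution
```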
Example canonical form
```
fetch_service_logs|auth-service|24h|v1
```
Conceptually, the cache key identifies the tool, its arguments, and its version. The implementations here encode that tuple as sorted JSON rather than a pipe‑delimited string, but any deterministic, collision‑free encoding works.
Step 2: Cache setup (Redis example)
```python
import redis
import hashlib
import json

redis_client = redis.Redis(host="localhost", port=6379)


def cache_key(canonical: str) -> str:
    return hashlib.sha256(canonical.encode()).hexdigest()
```
Step 3: Cached tool execution
```python
def execute_tool_cached(tool_name: str, raw_args: dict):
    tool = TOOLS[tool_name]
    args = tool["input_model"](**raw_args)

    # Canonical form: tool name + version + sorted args
    canonical = json.dumps(
        {
            "tool": tool_name,
            "version": tool["version"],
            "args": args.model_dump(),
        },
        sort_keys=True,
        separators=(",", ":")
    )
    key = cache_key(canonical)

    cached = redis_client.get(key)
    if cached:
        print("CACHE HIT — skipping infra call")
        return tool["output_model"](**json.loads(cached))

    print("CACHE MISS — executing tool")
    raw_result = tool["handler"](**args.model_dump())
    validated = tool["output_model"](logs=raw_result)

    # TTL comes from the registry, so expiry policy lives with the tool
    redis_client.setex(
        key,
        tool["cache_ttl"],
        validated.model_dump_json()
    )
    return validated
```
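Assuming a local Redis instance is running, the duplicate pair from the start of Part II now costs one backend query:
```python
execute_tool_cached("fetch_service_logs",
                    {"service": "auth-service", "window": "24h"})
# CACHE MISS — executing tool

execute_tool_cached("fetch_service_logs",
                    {"window": "24h", "service": "auth-service"})
# CACHE HIT — skipping infra call
```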
Execution flow for deterministic caching
```
Validated tool request
          ↓
   Canonicalization
          ↓
    Hash generation
          ↓
     Redis lookup
   ┌───────────────┐
   │   Cache HIT   │ → return cached result
   └───────────────┘
          ↓
      Cache MISS
          ↓
Execute expensive tool
          ↓
    Validate output
          ↓
Store result with TTL
          ↓
     Return result
```
Separation of responsibilities
| Problem | Solved by |
|---|---|
| Invalid input | Contract‑driven execution |
| Infrastructure crashes | Contract‑driven execution |
| Duplicate execution | Deterministic caching |
| Cost explosion | Deterministic caching |
Final Takeaway
Agentic AI systems become production‑ready when tool execution is engineered like backend infrastructure, not treated as an LLM side effect.
- Contracts make execution safe.
- Caching makes execution scalable.
Skipping either guarantees failure.