I Baked a Football Cake and It Taught Me About Building AI Agents
Introduction
I recently baked a football cake, and it helped me realize that AI agents are built a lot like layered desserts. Here’s how flavors, molds, and icing map to agentic design.
The code example below breaks user goals down into actionable steps and executes them using either custom tools or LLM reasoning. It uses a regex to extract numbered steps from the LLM’s plan output, then executes each step by matching keywords like “search” or “compute” to the appropriate tool, falling back to LLM reasoning otherwise.
Just like a custom cake has layers of flavor, structure, and decoration, an AI agent has its own stack.
It uses the Llama 3 model (run locally via Ollama) and two simple custom tools:
search_tool() – simulates a search engine returning mock results.
compute_tool() – simulates a computation task returning a placeholder result.
The base model (the “sponge layer”) handles basic reasoning via the LLM.
Ollama LLM Wrapper
import requests
import json

class OllamaLLM:
    def __init__(self, model="llama3"):
        self.model = model

    def __call__(self, prompt: str) -> str:
        """Send a prompt to a local Ollama instance."""
        resp = requests.post(
            "http://localhost:11434/api/generate",
            json={"model": self.model, "prompt": prompt, "stream": False},
        )
        resp.raise_for_status()  # surface HTTP errors instead of parsing an error body
        text = json.loads(resp.text).get("response", "")
        return text
Base Agent
class AgentCore:
    def __init__(self, llm):
        self.llm = llm

    def reason(self, prompt):
        return self.llm(prompt)
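Because AgentCore only wraps a callable, you can exercise it offline with any function standing in for the Ollama client. A minimal sketch (the stub below is my own test double, not part of the article’s stack; the class is re-stated so the snippet runs on its own):

```python
# Re-stating AgentCore so this snippet is self-contained.
class AgentCore:
    def __init__(self, llm):
        self.llm = llm

    def reason(self, prompt):
        return self.llm(prompt)

# Stub LLM: any callable that takes a prompt string and returns a string works.
def stub_llm(prompt: str) -> str:
    return f"echo: {prompt}"

agent = AgentCore(llm=stub_llm)
print(agent.reason("hello"))  # echo: hello
```

This is handy for unit-testing the agent logic without a model server running.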
Local Tools (the icing and décor)
def search_tool(query: str) -> dict:
    return {
        "tool": "search",
        "query": query,
        "results": [
            {"title": "Top NFL QBs 2024", "eff": 98.1},
            {"title": "Quarterback Rankings", "eff": 95.6},
        ],
    }

def compute_tool(task: str) -> dict:
    return {
        "tool": "compute",
        "task": task,
        "result": 42,  # placeholder result
    }
Step‑Parsing Regex
import re

# Detect common numbered step styles in LLM output, e.g.:
#   1. Do X
#   1) Do X
#   Step 1: Do X
#   **Step 1:** Do X
#   - Step 1: Do X
# Note: the pattern requires a digit, so spelled-out steps like
# "Step One:" are NOT matched.
STEP_REGEX = re.compile(
    r"(?:^|\s)(?:\*\*)?(?:Step\s*)?(\d+)[\.\):\- ]+(.*)", re.IGNORECASE
)
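A quick sanity check of the pattern against the step styles listed above (the regex is re-stated so the snippet runs standalone):

```python
import re

STEP_REGEX = re.compile(
    r"(?:^|\s)(?:\*\*)?(?:Step\s*)?(\d+)[\.\):\- ]+(.*)", re.IGNORECASE
)

# Sample lines in the formats the parser is expected to handle.
samples = [
    "1. Do X",
    "1) Do X",
    "Step 1: Do X",
    "- Step 1: Do X",
]

for s in samples:
    m = STEP_REGEX.search(s)
    print(f"{s!r} -> {m.group(2).strip()}")  # each prints 'Do X'
```

All four styles yield the same cleaned step text, which is what parse_steps below relies on.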
Structured Agent (prompt logic & tool execution)
class StructuredAgent(AgentCore):
    def parse_steps(self, plan: str):
        """Extract step lines starting with numbers."""
        lines = plan.split("\n")
        steps = []
        for line in lines:
            match = STEP_REGEX.search(line.strip())
            if match:
                cleaned = match.group(2).strip()
                steps.append(cleaned)
        return steps

    def execute_step(self, step: str):
        step_lower = step.lower()
        if "search" in step_lower:
            return search_tool(step)
        if "calculate" in step_lower or "compute" in step_lower:
            return compute_tool(step)
        # fallback: let the model reason
        return self.reason(step)

    def run(self, goal: str):
        PLAN_PROMPT = f"""You are a task decomposition engine.
Your ONLY job is to break the user's goal into a small set of concrete, functional steps.
Your outputs MUST stay within the domain of the user’s goal.
If the goal references football, metrics, or sports, remain in that domain only.
RULES:
- Only return steps directly needed to complete the user’s goal.
- Do NOT invent topics, examples, reviews, or unrelated domains.
- Do NOT expand into full explanations.
- No marketing language.
- No creative writing.
- No assumptions beyond the user's exact goal.
- No extra commentary.
FORMAT:
1. <short step>
2. <short step>
3. <short step>
User goal: "{goal}"
"""
        plan = self.llm(PLAN_PROMPT)
        steps = self.parse_steps(plan)
        outputs = []
        for step in steps:
            outputs.append({
                "step": step,
                "output": self.execute_step(step),
            })
        return outputs
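To see plan parsing and tool dispatch end-to-end without a running Ollama server, you can feed the same logic a canned “plan” standing in for the model’s output. A minimal standalone sketch (the plan text and simplified tools are mine, purely for illustration):

```python
import re

STEP_REGEX = re.compile(
    r"(?:^|\s)(?:\*\*)?(?:Step\s*)?(\d+)[\.\):\- ]+(.*)", re.IGNORECASE
)

# Simplified stand-ins for the article's tools.
def search_tool(query):
    return {"tool": "search", "query": query}

def compute_tool(task):
    return {"tool": "compute", "task": task, "result": 42}

def parse_steps(plan):
    steps = []
    for line in plan.split("\n"):
        m = STEP_REGEX.search(line.strip())
        if m:
            steps.append(m.group(2).strip())
    return steps

def execute_step(step):
    s = step.lower()
    if "search" in s:
        return search_tool(step)
    if "calculate" in s or "compute" in s:
        return compute_tool(step)
    return f"(LLM would reason about: {step})"  # fallback path

# Canned plan, standing in for what the LLM would return.
plan = """1. Search for QB efficiency stats
2. Compute the average rating
3. Summarize the findings"""

for step in parse_steps(plan):
    print(step, "->", execute_step(step))
```

Step 1 routes to the search tool, step 2 to the compute tool, and step 3 hits the LLM fallback, mirroring what run() does with a live model.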
User‑Facing Agent (formatted output layer)
class FinalAgent(StructuredAgent):
    def respond(self, goal: str):
        results = self.run(goal)
        formatted = "\n".join(
            f"- **{r['step']}** → {r['output']}"
            for r in results
        )
        return (
            f"## Result for goal: *{goal}*\n\n"
            f"{formatted}\n"
        )
Test Cases
if __name__ == "__main__":
    agent = FinalAgent(llm=OllamaLLM("llama3"))

    tests = [
        "Compare NFL quarterback efficiency metrics and summarize insights.",
        "Search for top training drills for youth football players.",
        "Compute a simple metric and explain how you'd structure the process.",
    ]

    for i, t in enumerate(tests, 1):
        print("=" * 70)
        print(f"TEST {i}: {t}")
        print(agent.respond(t))
        print()
Sample Output (first test)
======================================================================
Compare NFL quarterback efficiency metrics and summarize insights.
Gather data on NFL quarterback statistics → Here are some key statistics for NFL quarterbacks, gathered from various sources including Pro-Football-Reference.com, ESPN, and NFL.com:
...
Calculate averages and rankings for each quarterback → {'tool': 'compute', 'task': 'Calculate averages and rankings for each quarterback', 'result': 42}
Whether you’re baking or writing code, structure matters. Think in layers. And if you ever need a sweet analogy to explain AI agents, try cake. Got a dev‑inspired dessert metaphor? Drop it in the comments and let’s make tech tasty.