Why I Wrote an AI Workflow Engine

Published: February 17, 2026, 9:56 PM EST
6 min read
Source: Dev.to

📖 A Recurring Pattern

“I kept building the same things over and over.”

Over the past couple of years I’ve been working on a handful of applications that all share a common thread:

| Application | Core Need |
|---|---|
| Memoir Writing Assistant | Long‑running, conversational workflows that guide users through deeply personal storytelling. |
| Small‑Business Finance App | Business‑process workflows (approvals, reconciliations, recurring tasks) layered with AI for categorisation and insights. |
| Customer Chatbot (current client) | RAG‑driven knowledge‑base navigation, support handling, parts selection & ordering. |
| Academic Research Assistant | LLM‑powered analysis and synthesis of research projects. |

All of them required three things I kept having to rebuild from scratch:

  1. Durable workflow orchestration
  2. Integration with AI agents
  3. Control over AI costs

Each new project boiled down to the same steps:

  • Queue the work
  • Call the LLM
  • Track progress & retries
  • Persist state
  • Notify the user

I knew there had to be a better way, so I started looking.

🔎 Existing Tools – What’s Missing?

| Tool | Strengths | Weaknesses (for AI‑centric workloads) |
|---|---|---|
| Temporal | Gold‑standard durable execution | Heavy maintenance overhead; no awareness of token usage or LLM pricing. |
| Airflow | Mature batch scheduler | Designed for cron‑based DAGs, not event‑driven interactive flows; massive Docker image & RAM footprint. |
| LangChain / LangGraph | AI‑focused, rich LLM tooling | Python‑only, no native durability; production requires LangSmith (proprietary, ~$1k per M executions). |

Bottom line: existing solutions are either operationally heavy and AI‑ignorant, or AI‑aware but lacking durability and independence.

💡 Core Realisation

In AI applications, LLM cost is an application‑infrastructure concern.

When a workflow calls an LLM, the cost of that call should influence the next step:

  • Can I afford Claude Sonnet, or should I fall back to Haiku?
  • Has this workflow already burned through its budget?
  • Can I reuse a cached result instead of making another API call?

These decisions must be made during execution, not after the fact, so cost tracking belongs in the orchestrator itself rather than in a reporting dashboard you consult once the money is spent.
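To make that concrete, here is a toy ledger (my own naming, not the Kruxia Flow API) that prices each call and enforces the budget *before* the call is made:

```python
class BudgetExceeded(RuntimeError):
    """Raised when the next call would push the workflow over budget."""

class CostLedger:
    """Toy in-workflow cost tracker: every LLM call is priced and
    checked against the remaining budget before it happens."""

    def __init__(self, budget_usd: float):
        self.budget_usd = budget_usd
        self.spent_usd = 0.0

    def charge(self, est_tokens: int, usd_per_1k_tokens: float) -> float:
        cost = est_tokens / 1000 * usd_per_1k_tokens
        if self.spent_usd + cost > self.budget_usd:
            raise BudgetExceeded(
                f"call would cost ${cost:.4f}, only "
                f"${self.budget_usd - self.spent_usd:.4f} remaining")
        self.spent_usd += cost
        return cost
```

Because the check happens inside the execution path, the orchestrator can react: fall back to a cheaper model, serve a cached result, or pause the workflow.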

🚀 Introducing Kruxia Flow

A durable workflow engine purpose‑built for AI applications.

| Characteristic | Details |
|---|---|
| Binary | Single Rust binary (~7.5 MiB) |
| Persistence | PostgreSQL (no Kafka, Elasticsearch, Cassandra) |
| Deployment | One binary + one DB → tiny Docker image (≈ 63 MiB) |
| Portability | Runs on a Raspberry Pi Zero, cloud VMs, or any container host |

AI‑Native Core Features

  1. Automatic Cost Tracking

    • Tokens counted & priced in real‑time.
    • Supports Anthropic, OpenAI, Google Gemini, and self‑hosted Ollama models.
  2. Budget Enforcement

    • Set a budget per workflow or per activity.
    • Engine checks budget before each LLM call; can abort or alert if exceeded.
  3. Cost‑Aware Model Fallback

    • Define a fallback chain (e.g., Claude Sonnet → Claude Haiku → Ollama).
    • Orchestrator picks the most capable model that fits the remaining budget.
  4. Semantic Caching

    • Similar queries hit a cache instead of the API.
    • Reduces redundant calls by 50‑80 % for common patterns (FAQ, RAG).
  5. Python SDK

    • Define workflows & custom workers in Python.
    • Built‑in support for pandas, DuckDB, scikit‑learn.
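The cost‑aware fallback (feature 3) is simple to sketch. The chain and per‑token prices below are illustrative placeholders, not real Kruxia Flow configuration or vendor pricing:

```python
def pick_model(chain, remaining_usd, est_tokens):
    """Walk a fallback chain ordered most -> least capable and
    return the first model whose estimated cost fits the budget."""
    for model, usd_per_1k_tokens in chain:
        if est_tokens / 1000 * usd_per_1k_tokens <= remaining_usd:
            return model
    return None  # nothing affordable: abort the step or alert

# Illustrative prices (USD per 1k tokens), not actual vendor rates.
CHAIN = [
    ("claude-sonnet", 0.015),
    ("claude-haiku", 0.001),
    ("ollama-local", 0.0),   # self-hosted: no per-token cost
]
```

With a chain like this, a workflow that has burned most of its budget degrades gracefully to cheaper models instead of failing outright.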

📊 Benchmarks (Kruxia Flow vs. Temporal & Airflow)

| Metric | Kruxia Flow | Temporal | Airflow |
|---|---|---|---|
| Throughput | 93 workflows/s | 66 workflows/s | 8 workflows/s |
| Peak Memory | 328 MiB | ~1 GiB (varies) | 7.2 GiB |
| Docker Image Size | 63 MiB | > 500 MiB | > 1 GiB |
| Hardware Tested | 2 vCPU, 4 GiB RAM (Linux) | Same | Same |
| Raspberry Pi Zero | ✅ Runs | ❌ Too heavy | ❌ Too heavy |

👥 Who Is Kruxia Flow For?

  • AI startups shipping agents to production who need real‑time cost visibility.
  • Small businesses wanting workflow automation without a five‑figure infrastructure bill.
  • Data teams that combine batch pipelines, ML training, NLP processing, and LLM agents while staying operationally lightweight.

If you fit any of the above, Kruxia Flow was built for you.

📜 Licensing & Availability

  • Core engine, LLM cost tracking, budget enforcement, multi‑provider support, token streaming: AGPL‑3.0 (open source).
  • Python SDK: MIT‑licensed.

Repository: https://github.com/kruxia/kruxia-flow (public, actively maintained)

🛠️ Getting Started (Quick‑Start)

# 1️⃣ Pull the Docker image (63 MiB)
docker pull ghcr.io/kruxia/kruxia-flow:latest

# 2️⃣ Run PostgreSQL (if you don’t have one already)
docker run -d \
  -e POSTGRES_PASSWORD=secret \
  -e POSTGRES_USER=kruxia \
  -e POSTGRES_DB=kruxia \
  -p 5432:5432 \
  postgres:15

# 3️⃣ Start Kruxia Flow
docker run -d \
  -e DATABASE_URL=postgresql://kruxia:secret@host.docker.internal:5432/kruxia \
  -p 8080:8080 \
  ghcr.io/kruxia/kruxia-flow:latest

# 4️⃣ Install the Python SDK
pip install kruxia-flow-sdk

Now you can define a workflow in Python:

from kruxia import workflow, activity, ModelSelector

@workflow
async def memoir_assistant(user_id: str):
    # Step 1 – fetch user profile (cached)
    profile = await activity(fetch_profile, user_id)

    # Step 2 – generate outline (cost‑aware)
    outline = await activity(
        generate_outline,
        profile,
        model=ModelSelector(preferred="claude-sonnet", fallback="claude-haiku")
    )

    # Step 3 – store result
    await activity(save_outline, user_id, outline)

🎉 Closing Thoughts

Existing workflow engines either burden you with operational complexity or ignore the economics of LLM usage.
Kruxia Flow bridges that gap by delivering a tiny, Rust‑powered, AI‑aware orchestrator that:

  • Tracks costs in real time
  • Enforces budgets before they’re exceeded
  • Falls back to cheaper models automatically
  • Caches semantically similar queries

All while staying lightweight, open source, and easy to run for solo developers or small teams.

Happy building! 🚀

Roadmap

  • A web dashboard and a TypeScript SDK are on the roadmap, along with further improvements to semantic caching.

I’m still building, and there’s a lot of road ahead. But the foundation is solid, and I’m already using it in my own projects — which, after all, is why I built it in the first place.

If any of this resonates, I’d be glad to have you take a look. The code is on GitHub, and there’s a Discord where I’m happy to talk about the design, the roadmap, or working with AI.
