# Why I Wrote an AI Workflow Engine
Source: Dev.to
## 📖 A Recurring Pattern

> “I kept building the same things over and over.”
Over the past couple of years I’ve been working on a handful of applications that all share a common thread:
| Application | Core Need |
|---|---|
| Memoir Writing Assistant | Long‑running, conversational workflows that guide users through deeply personal storytelling. |
| Small‑Business Finance App | Business‑process workflows (approvals, reconciliations, recurring tasks) layered with AI for categorisation and insights. |
| Customer Chatbot (current client) | RAG‑driven knowledge‑base navigation, support handling, parts selection & ordering. |
| Academic Research Assistant | LLM‑powered analysis and synthesis of research projects. |
All of them required three things I kept having to rebuild from scratch:
- Durable workflow orchestration
- Integration with AI agents
- Control over AI costs
Each new project boiled down to the same steps:
- Queue the work
- Call the LLM
- Track progress & retries
- Persist state
- Notify the user
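The repeated plumbing above can be sketched in a few lines of Python. Everything here (`run_task`, `llm_call`, `state_store`) is illustrative, not any real API — it just shows the queue/call/retry/persist/notify loop each project kept reimplementing:

```python
import time

def run_task(task, llm_call, max_retries=3, state_store=None, notify=print):
    """Run one queued task: call the LLM, retry on failure,
    persist state after each attempt, and notify the user."""
    state_store = state_store if state_store is not None else {}
    for attempt in range(1, max_retries + 1):
        try:
            result = llm_call(task["prompt"])            # call the LLM
            state_store[task["id"]] = {                  # persist state
                "status": "done", "result": result, "attempts": attempt,
            }
            notify(f"task {task['id']} finished")        # notify the user
            return result
        except Exception:
            state_store[task["id"]] = {"status": "retrying", "attempts": attempt}
            time.sleep(0)                                # backoff elided in this sketch
    state_store[task["id"]] = {"status": "failed", "attempts": max_retries}
    notify(f"task {task['id']} failed")
    return None
```

Every project needed some variant of this loop, plus the surrounding queueing and scheduling — exactly the kind of undifferentiated work an orchestrator should absorb.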
I knew there had to be a better way, so I started looking.
## 🔎 Existing Tools – What’s Missing?
| Tool | Strengths | Weaknesses (for AI‑centric workloads) |
|---|---|---|
| Temporal | Gold‑standard durable execution | Heavy maintenance overhead, no awareness of token usage or LLM pricing. |
| Airflow | Mature batch scheduler | Designed for cron‑based DAGs, not event‑driven interactive flows; massive Docker image & RAM footprint. |
| LangChain / LangGraph | AI‑focused, rich LLM tooling | Python‑only, no native durability; production requires LangSmith (proprietary, ~$1k per M executions). |
**Bottom line:** Existing solutions are either operationally heavy & AI‑ignorant or AI‑aware but lack durability & independence.
## 💡 Core Realisation
In AI applications, LLM cost is an application‑infrastructure concern.
When a workflow calls an LLM, the cost of that call should influence the next step:
- Can I afford Claude Sonnet, or should I fall back to Haiku?
- Has this workflow already burned through its budget?
- Can I reuse a cached result instead of making another API call?
These decisions must be made during execution, not after the fact — which is why cost tracking belongs in the orchestrator itself, not in a retrospective dashboard.
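A minimal sketch of that in‑flight decision, assuming illustrative per‑million‑token prices and a hypothetical `pick_model` helper (this is not Kruxia Flow’s actual API): the orchestrator estimates the cost of the next call and selects the most capable model the remaining budget can afford, before the call is made.

```python
# Assumed $/M input tokens, ordered most to least capable. Illustrative only.
PRICES = {"sonnet": 3.00, "haiku": 0.25}

def pick_model(remaining_budget_usd, est_tokens):
    """Return (model, est_cost) for the most capable model whose
    estimated cost fits the remaining budget, or (None, 0.0) if none fit."""
    for model in ("sonnet", "haiku"):
        est_cost = PRICES[model] * est_tokens / 1_000_000
        if est_cost <= remaining_budget_usd:
            return model, est_cost
    return None, 0.0  # over budget: abort, alert, or fall back to cache
```

The key point is the ordering: the budget check happens *before* the API call, so an over‑budget workflow degrades gracefully instead of discovering the overrun on next month’s invoice.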
## 🚀 Introducing Kruxia Flow
A durable workflow engine purpose‑built for AI applications.
| Characteristic | Details |
|---|---|
| Binary | Single Rust binary (~7.5 MiB) |
| Persistence | PostgreSQL (no Kafka, Elasticsearch, Cassandra) |
| Deployment | One binary + one DB → tiny Docker image (≈ 63 MiB) |
| Portability | Runs on a Raspberry Pi Zero, cloud VMs, or any container host |
### AI‑Native Core Features
- **Automatic Cost Tracking**
  - Tokens counted and priced in real time.
  - Supports Anthropic, OpenAI, Google Gemini, and self‑hosted Ollama models.
- **Budget Enforcement**
  - Set a budget per workflow or per activity.
  - The engine checks the budget before each LLM call and can abort or alert if it would be exceeded.
- **Cost‑Aware Model Fallback**
  - Define a fallback chain (e.g., Claude Sonnet → Claude Haiku → Ollama).
  - The orchestrator picks the most capable model that fits the remaining budget.
- **Semantic Caching**
  - Similar queries hit a cache instead of the API.
  - Reduces redundant calls by 50–80% for common patterns (FAQ, RAG).
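A toy illustration of the semantic‑caching idea, assuming queries are already embedded as plain float vectors and cosine similarity is the match criterion (a production system would use a vector index rather than a linear scan; all names here are hypothetical):

```python
import math

def cosine(a, b):
    """Cosine similarity between two equal-length float vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    na = math.sqrt(sum(x * x for x in a))
    nb = math.sqrt(sum(x * x for x in b))
    return dot / (na * nb) if na and nb else 0.0

class SemanticCache:
    def __init__(self, threshold=0.9):
        self.threshold = threshold
        self.entries = []  # list of (embedding, cached_response)

    def get(self, embedding):
        """Return the cached response for the closest entry above
        the similarity threshold, else None (cache miss)."""
        best = max(self.entries,
                   key=lambda e: cosine(e[0], embedding), default=None)
        if best and cosine(best[0], embedding) >= self.threshold:
            return best[1]  # cache hit: skip the API call entirely
        return None

    def put(self, embedding, response):
        self.entries.append((embedding, response))
```

Queries that are near‑duplicates of an earlier one (rephrased FAQ questions, repeated RAG lookups) resolve from the cache, which is where the claimed reduction in redundant calls comes from.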
- **Python SDK**
  - Define workflows and custom workers in Python.
  - Built‑in support for pandas, DuckDB, and scikit‑learn.
## 📊 Benchmarks (Kruxia Flow vs. Temporal & Airflow)
| Metric | Kruxia Flow | Temporal | Airflow |
|---|---|---|---|
| Throughput | 93 workflows / s | 66 workflows / s | 8 workflows / s |
| Peak Memory | 328 MiB | ~1 GiB (varies) | 7.2 GiB |
| Docker Image Size | 63 MiB | > 500 MiB | > 1 GiB |
| Hardware Tested | 2 vCPU, 4 GiB RAM (Linux) | Same | Same |
| Raspberry Pi Zero | ✅ Runs | ❌ Too heavy | ❌ Too heavy |
## 👥 Who Is Kruxia Flow For?
- AI startups shipping agents to production who need real‑time cost visibility.
- Small businesses wanting workflow automation without a five‑figure infrastructure bill.
- Data teams that combine batch pipelines, ML training, NLP processing, and LLM agents while staying operationally lightweight.
If you fit any of the above, Kruxia Flow was built for you.
## 📜 Licensing & Availability
- Core engine, LLM cost tracking, budget enforcement, multi‑provider support, token streaming – AGPL‑3.0 (open source).
- Python SDK – MIT‑licensed.
Repository: https://github.com/kruxia/kruxia-flow (public, actively maintained)
## 🛠️ Getting Started (Quick‑Start)
```bash
# 1️⃣ Pull the Docker image (63 MiB)
docker pull ghcr.io/kruxia/kruxia-flow:latest

# 2️⃣ Run PostgreSQL (if you don’t have one already)
docker run -d \
  -e POSTGRES_PASSWORD=secret \
  -e POSTGRES_USER=kruxia \
  -e POSTGRES_DB=kruxia \
  -p 5432:5432 \
  postgres:15

# 3️⃣ Start Kruxia Flow
docker run -d \
  -e DATABASE_URL=postgresql://kruxia:secret@host.docker.internal:5432/kruxia \
  -p 8080:8080 \
  ghcr.io/kruxia/kruxia-flow:latest

# 4️⃣ Install the Python SDK
pip install kruxia-flow-sdk
```
Now you can define a workflow in Python:
```python
from kruxia import workflow, activity, ModelSelector

@workflow
async def memoir_assistant(user_id: str):
    # Step 1 – fetch user profile (cached)
    profile = await activity(fetch_profile, user_id)

    # Step 2 – generate outline (cost-aware)
    outline = await activity(
        generate_outline,
        profile,
        model=ModelSelector(preferred="claude-sonnet", fallback="claude-haiku"),
    )

    # Step 3 – store result
    await activity(save_outline, user_id, outline)
```
## 🎉 Closing Thoughts
Existing workflow engines either burden you with operational complexity or ignore the economics of LLM usage.
Kruxia Flow bridges that gap by delivering a tiny, Rust‑powered, AI‑aware orchestrator that:
- Tracks costs in real time
- Enforces budgets before they’re exceeded
- Falls back to cheaper models automatically
- Caches semantically similar queries
All while staying lightweight, open source, and easy to run for solo developers or small teams.
Happy building! 🚀
## Roadmap
- Semantic caching, a web dashboard, and a TypeScript SDK.
I’m still building, and there’s a lot of road ahead. But the foundation is solid, and I’m already using it in my own projects — which, after all, is why I built it in the first place.
If any of this resonates, I’d be glad to have you take a look. The code is on GitHub, and there’s a Discord where I’m happy to talk about the design, the roadmap, or working with AI.