Show HN: Open-Source SDK for AI Knowledge Work

Published: February 10, 2026 at 12:06 PM EST
3 min read

Source: Hacker News

Overview

GitHub:

Most AI agent frameworks target code: write code, run tests, fix errors, repeat. This works because code has a natural verification signal—it either works or it doesn’t.

The Knowledge Work SDK treats knowledge work like an engineering problem:

Task → Brief → Rubric (hidden from executor) → Work → Verify → Fail? → Retry → Pass → Submit

The orchestrator coordinates sub‑agents, web search, code execution, and file I/O, then checks its own work against criteria it can’t game (the rubric is generated in a separate call and the executor never sees it directly).

The SDK was originally built as a harness for RL training on knowledge tasks. The rubric serves as the reward function, providing a structured reward signal for tasks that normally lack one.
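One way to read "the rubric serves as the reward function": the fraction of criteria met is a scalar in [0, 1] that an RL trainer can consume. A minimal sketch, with an invented `rubric_reward` helper and a deliberately naive substring check:

```python
# Hypothetical reward shaping from a rubric; not the SDK's actual scoring.
def rubric_reward(work: str, rubric: list[str]) -> float:
    # Fraction of rubric criteria found in the work: a dense reward in [0, 1]
    # for tasks that normally lack a natural verification signal.
    met = sum(1 for criterion in rubric if criterion.lower() in work.lower())
    return met / len(rubric)
```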


What Makes Knowledge Work Different from Code?

The SDK adds functionality that many current agents lack for knowledge work:

Explore Mode

  • Maps the solution space, identifies set‑level gaps, and presents multiple options.
  • Generates N distinct approaches, each with explicit assumptions and counterfactuals (e.g., “works if X, breaks if Y”).
  • Ends with a summary of set‑level gaps—what angles the entire set missed.
  • Useful for strategy, design, and creative problems where trade‑offs matter.
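The shape of an explore-mode result might look like the following. This is a guess at the output structure, not the SDK's schema; the `Approach` dataclass, the canned approaches, and the gap list are all illustrative.

```python
# Hypothetical shape of an explore-mode result; fields are illustrative.
from dataclasses import dataclass

@dataclass
class Approach:
    summary: str
    works_if: str    # explicit assumption
    breaks_if: str   # counterfactual

def explore(problem: str, n: int = 3) -> dict:
    # In the real SDK these would come from model calls; here they are canned.
    approaches = [
        Approach(
            summary=f"option {i} for {problem}",
            works_if=f"assumption {i} holds",
            breaks_if=f"assumption {i} fails",
        )
        for i in range(1, n + 1)
    ]
    # Set-level gap analysis: angles that no option in the set covers.
    gaps = ["no option addresses cost"]  # illustrative placeholder
    return {"approaches": approaches, "set_level_gaps": gaps}
```

The key structural point is the final `set_level_gaps` field: the analysis covers the set as a whole, not just each option in isolation.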

See the example repository for a sense of how this differs.

Checkpointing

  • Allows you to pause a multi‑agent workflow, inspect where it went wrong, and resume or fork from a specific stage.
  • Helpful for rollouts, multiple explorations after a search phase, or re‑running a particular segment.
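Checkpointing amounts to snapshotting state after each stage so any point can be inspected, resumed, or forked. A minimal sketch under that assumption; the `Workflow` class and its methods are invented for illustration:

```python
# Hypothetical checkpoint/fork mechanics; class and method names are invented.
import copy

class Workflow:
    def __init__(self):
        self.stage = 0
        self.state: dict = {}
        self._checkpoints: dict[int, dict] = {}

    def run_stage(self, name: str) -> None:
        self.stage += 1
        self.state[name] = f"output of {name}"
        # Snapshot after every stage so any point can be resumed later.
        self._checkpoints[self.stage] = copy.deepcopy(self.state)

    def fork(self, stage: int) -> "Workflow":
        # New workflow branched from an earlier checkpoint, e.g. to re-run
        # several explorations after a shared search phase.
        branch = Workflow()
        branch.stage = stage
        branch.state = copy.deepcopy(self._checkpoints[stage])
        branch._checkpoints = {stage: copy.deepcopy(self._checkpoints[stage])}
        return branch
```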

Verification Loop

The verification step provides the main leverage:

  • A model that can accurately assess its own work against a rubric is more valuable than one that merely produces better first drafts.
  • The rubric makes quality legible to the agent, to humans, and potentially to a training signal.

Key Features

  • Remote Execution Environments: Works with Docker, e2b, local environments, browsers as sandboxes, etc. The model executes commands in your context and iterates based on the feedback loop. Code execution is treated as a protocol.
  • Tool Calling: Models can write terminal code and iterate based on feedback. You can pass functions or documentation in the context, and the model will generate and execute the necessary code (similar to Anthropic’s programmatic tool calling). Details:
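"Code execution is treated as a protocol" suggests the backend (Docker, e2b, local, browser) sits behind one interface: send code, get back an exit status and output. A local stand-in for that protocol, assuming a `(returncode, output)` contract that is my guess, not the SDK's:

```python
# Minimal local "execution environment" behind a code-execution protocol.
# The SDK would swap this for Docker, e2b, a browser, etc.; the function
# name and return contract here are assumptions for illustration.
import os
import subprocess
import sys
import tempfile

def run_in_sandbox(code: str) -> tuple[int, str]:
    # Write the model-generated code to a temp file and run it in a
    # subprocess, returning (exit code, combined output) as feedback.
    with tempfile.NamedTemporaryFile("w", suffix=".py", delete=False) as f:
        f.write(code)
        path = f.name
    try:
        proc = subprocess.run(
            [sys.executable, path],
            capture_output=True, text=True, timeout=10,
        )
        return proc.returncode, proc.stdout + proc.stderr
    finally:
        os.unlink(path)
```

The model iterates on whatever comes back: a nonzero exit code and a traceback are the feedback loop, just as failing tests are in coding agents.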

Guides & Examples

  • SDK Guide:
  • Extensible Mode (custom mode example):
  • Working with Files:
  • CSV Example:
  • Remote Execution Example:

License

MIT licensed. Feedback is welcome.


Comments URL:
Points: 4
Comments: 1
