Show HN: Trust Protocols for Anthropic/OpenAI/Gemini
Source: Hacker News
Problem
Much of my work right now involves complex, long‑running, multi‑agent teams. I kept running into the same problem: how do I keep these agents in line? Static rules weren’t cutting it, and I needed a scalable, agent‑native standard I could count on. There wasn’t one, so I built one.
Protocols
I created two open‑source protocols that extend A2A, granting AI agents behavioral contracts and runtime integrity monitoring:
- Agent Alignment Protocol (AAP) – declares what an agent is allowed to do and records what it has done.
- Agent Integrity Protocol (AIP) – monitors what an agent is thinking about doing and checks it against those declared boundaries.
The issue these protocols address is that AI agents make autonomous decisions but have no standard way to declare what they’re allowed to do, prove they’re doing it, or detect when they’ve drifted. Observability tools tell you what happened; these protocols tell you whether what happened was okay.
Concrete Example
Alignment Card
{
  "permitted": ["read_tickets", "draft_responses", "escalate_to_human"],
  "forbidden": ["access_payment_data", "issue_refunds", "modify_account_settings"],
  "escalation_triggers": ["billing_request_over_500"],
  "values": ["accuracy", "empathy", "privacy"]
}
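A minimal sketch of how an action could be checked against such a card. The helper below is illustrative only, not the actual AAP SDK API; field names mirror the example card above.

```python
# Hypothetical sketch: classify a proposed action against an Alignment Card.
# Not the real SDK API -- just the card-checking idea in plain Python.

ALIGNMENT_CARD = {
    "permitted": ["read_tickets", "draft_responses", "escalate_to_human"],
    "forbidden": ["access_payment_data", "issue_refunds", "modify_account_settings"],
    "escalation_triggers": ["billing_request_over_500"],
    "values": ["accuracy", "empathy", "privacy"],
}

def check_action(card: dict, action: str) -> str:
    """Return 'forbidden', 'permitted', or 'unknown' for a proposed action."""
    if action in card["forbidden"]:
        return "forbidden"
    if action in card["permitted"]:
        return "permitted"
    # Anything undeclared is treated as unknown; in practice you'd
    # default-deny and escalate rather than silently allow it.
    return "unknown"

print(check_action(ALIGNMENT_CARD, "issue_refunds"))  # forbidden
print(check_action(ALIGNMENT_CARD, "read_tickets"))   # permitted
```

Default-deny on unknown actions keeps the card authoritative: only explicitly permitted calls proceed without review.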
Integrity Checkpoint
When the agent receives a ticket like “Can you refund my last three orders?” its reasoning trace shows it considering a call to the payments API. AIP reads that thinking, compares it to the card, and produces:
{
  "verdict": "boundary_violation",
  "concerns": ["forbidden_action: access_payment_data"],
  "reasoning": "Agent considered payments API access, which is explicitly forbidden. Should escalate to human.",
  "confidence": 0.95
}
The agent is nudged back before it acts—preventing the violation in real time rather than after the fact.
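The checkpoint step can be sketched as follows. This is a toy approximation: the real AIP checkpoint evaluates the reasoning trace semantically, whereas the substring matching below only illustrates the shape of the input and verdict.

```python
# Hypothetical sketch of an AIP-style integrity checkpoint: scan a reasoning
# trace for mentions of forbidden actions and emit a verdict object shaped
# like the example above. Substring matching stands in for the protocol's
# actual (semantic) evaluation; confidence values are illustrative.

def integrity_checkpoint(card: dict, reasoning_trace: str) -> dict:
    concerns = [
        f"forbidden_action: {action}"
        for action in card["forbidden"]
        if action in reasoning_trace
    ]
    return {
        "verdict": "boundary_violation" if concerns else "aligned",
        "concerns": concerns,
        "confidence": 0.95 if concerns else 0.99,
    }

card = {"forbidden": ["access_payment_data", "issue_refunds"]}
trace = "Plan: call the payments API (access_payment_data) to issue_refunds."
print(integrity_checkpoint(card, trace)["verdict"])  # boundary_violation
```

Because the check runs on the trace, not the completed tool call, the violation can be caught before any side effect occurs.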
How It Works
- AAP defines the contract: what agents should do.
- AIP watches what they are actually thinking and flags divergences (the “conscience”).
- Over time, AIP builds a drift profile; if an agent that was cautious starts getting aggressive, the system notices.
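One simple way to picture a drift profile is a sliding window over checkpoint verdicts that alerts when the flag rate rises. This is an assumption about the mechanism, sketched for intuition; the actual AIP drift model is not specified here.

```python
# Hypothetical sketch of a drift profile: keep a sliding window of checkpoint
# results and flag drift when the violation rate exceeds a threshold.
# The window size and threshold are illustrative assumptions.
from collections import deque

class DriftProfile:
    def __init__(self, window: int = 100, threshold: float = 0.2):
        self.verdicts = deque(maxlen=window)  # True = flagged checkpoint
        self.threshold = threshold

    def record(self, flagged: bool) -> bool:
        """Record one checkpoint result; return True if drift is detected."""
        self.verdicts.append(flagged)
        rate = sum(self.verdicts) / len(self.verdicts)
        return rate > self.threshold

profile = DriftProfile(window=10, threshold=0.2)
for flagged in [False] * 7 + [True, True, True]:  # agent getting aggressive
    drifting = profile.record(flagged)
print(drifting)  # True
```

A cautious agent produces a near-zero flag rate; a sustained rise above baseline is what "the system notices."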
Multi‑Agent Coordination
When multiple agents work together, they exchange Alignment Cards and verify value compatibility before coordination begins. For example, an agent that values “move fast” and another that values “rollback safety” register low coherence, and the system surfaces that conflict before work starts.
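A compatibility check like this could be as simple as set overlap between the two cards' declared values. The Jaccard score below is a stand-in assumption; the protocol's actual coherence metric is not described in this post.

```python
# Hypothetical sketch of value-compatibility scoring between two Alignment
# Cards, using Jaccard overlap of their declared values. The real coherence
# metric used by the protocol may differ.

def value_coherence(values_a: list[str], values_b: list[str]) -> float:
    """Jaccard similarity of two value sets, in [0, 1]."""
    a, b = set(values_a), set(values_b)
    return len(a & b) / len(a | b) if a | b else 1.0

fast = ["move_fast", "autonomy"]
safe = ["rollback_safety", "caution", "autonomy"]
print(round(value_coherence(fast, safe), 2))  # 0.25
```

A low score like this would surface the "move fast" vs. "rollback safety" conflict before the agents start coordinating.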
A live demo with four agents handling a production incident is available here: https://mnemom.ai/showcase
Implementation
- The protocols are Apache‑licensed and work with any Anthropic, OpenAI, or Gemini agent.
- SDKs are shipped on npm and PyPI.
- A free gateway proxy (smoltbot) adds integrity checking to any agent with zero code changes.
Resources
- GitHub: https://github.com/mnemom
- Documentation: https://docs.mnemom.ai
- Demo video: https://youtu.be/fmUxVZH09So
- Comments (Hacker News): https://news.ycombinator.com/item?id=47062824