I built a pay-per-call MCP server too — here's the piece that almost broke everything

Published: June 11, 2026 at 02:03 AM EDT
3 min read
Source: Dev.to

When kirothebot dropped the breakdown of what the agent payment stack actually looks like, it landed because it’s a problem almost no one has documented honestly. Building pay-per-call on top of MCP is harder than it looks, and most of the complexity lives in one place: settlement timing.

The obvious approach is: run the tool, check if payment cleared, return the result. That’s backwards. Here’s why. If you settle after the call, you’ve already spent the compute. A non-paying agent can drain your resources and you have no recourse, because the work is already done. You can rate-limit after the fact, but by then you’ve done the work for free.

The correct sequence is: price check → authorization → tool execution → result delivery. The authorization step is what makes this different from a standard webhook with a Stripe call attached. Authorization in this context means the calling agent or its orchestrator has confirmed: (a) it has credit for this call type, (b) the credit is being reserved before execution, and (c) the tool will receive settlement confirmation as part of the return flow. That’s not how HTTP requests work out of the box. You need a layer that lives between the MCP protocol and your tool handler.

Here’s the complication that doesn’t show up until you have multiple concurrent agents: credit reservation under contention. If ten agents draw on a shared balance of 5 credits and they all hit your MCP server simultaneously, naive implementations let all ten through, because at the moment each request lands, the balance appears to cover it. At one credit per call, you end up with ten executions and at most five payments. This is a race condition in the authorization layer, not in your tool logic. The fix is optimistic locking on credit state: a reservation is written only if the balance has not changed since it was read, and is retried otherwise. That is standard database concurrency control, but it needs to be built into the payment middleware explicitly.

I built MnemoPay to solve exactly this stack: authorization, per-call settlement, and credit reservation with proper concurrency handling.
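
To make the reservation race concrete, here is a minimal sketch of optimistic locking on credit state. It is an illustration under assumptions, not MnemoPay's implementation: `CreditStore`, `reserve`, and `burst` are hypothetical names, the store is an in-memory stand-in for a database row with a version column, and contention is simulated by letting every request read the balance before any request writes.

```typescript
// Hypothetical in-memory stand-in for a credits table with a version column.
interface CreditRow {
  balance: number; // credits still available to reserve
  version: number; // bumped on every successful write
}

class CreditStore {
  private rows = new Map<string, CreditRow>();

  seed(accountId: string, balance: number): void {
    this.rows.set(accountId, { balance, version: 0 });
  }

  // Returns a copy, so the caller holds a snapshot (like the result of a SELECT).
  read(accountId: string): CreditRow {
    const row = this.rows.get(accountId);
    if (!row) throw new Error(`unknown account: ${accountId}`);
    return { ...row };
  }

  // Unconditional write: what a naive "read, check, write" flow ends up doing.
  overwrite(accountId: string, balance: number): void {
    const row = this.read(accountId);
    this.rows.set(accountId, { balance, version: row.version + 1 });
  }

  // Compare-and-set: succeeds only if nobody wrote since `expectedVersion` was read.
  compareAndSet(accountId: string, expectedVersion: number, balance: number): boolean {
    const row = this.read(accountId);
    if (row.version !== expectedVersion) return false;
    this.rows.set(accountId, { balance, version: row.version + 1 });
    return true;
  }
}

// Optimistic reservation: start from a (possibly stale) snapshot, retry on conflict.
function reserve(store: CreditStore, accountId: string, cost: number,
                 snapshot: CreditRow = store.read(accountId), maxRetries = 20): boolean {
  for (let attempt = 0; attempt <= maxRetries; attempt++) {
    if (snapshot.balance < cost) return false; // refuse before any tool runs
    if (store.compareAndSet(accountId, snapshot.version, snapshot.balance - cost)) return true;
    snapshot = store.read(accountId); // lost the race: re-read and try again
  }
  return false;
}

// Simulates contention: every request reads the balance before any request writes.
function burst(store: CreditStore, accountId: string, requests: number, cost: number,
               mode: "naive" | "optimistic"): number {
  const snapshots = Array.from({ length: requests }, () => store.read(accountId));
  let admitted = 0;
  for (const snap of snapshots) {
    if (mode === "naive") {
      if (snap.balance < cost) continue;
      store.overwrite(accountId, snap.balance - cost); // stale check, blind write
      admitted++;
    } else if (reserve(store, accountId, cost, snap)) {
      admitted++;
    }
  }
  return admitted;
}

const naive = new CreditStore();
naive.seed("shared-pool", 5);
console.log(burst(naive, "shared-pool", 10, 1, "naive")); // 10: every stale check passes

const guarded = new CreditStore();
guarded.seed("shared-pool", 5);
console.log(burst(guarded, "shared-pool", 10, 1, "optimistic")); // 5: the rest are refused
```

In a real deployment the compare-and-set would typically be a conditional update in the database rather than an in-process check, so that it holds across server instances.
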
The integration wraps your MCP tool handler, handles the authorization check before execution, and returns settlement confirmation with the result. 672 tests in v1.0.0-beta.1. npm-native. Already listed on Smithery and ClawHub. The SDK exposes an Agent FICO score (300-850) so the calling side can see its own credit standing and route around expensive tools when budget is constrained.

The piece kirothebot built manually, the payment stack, is what we’ve packaged as a drop-in. If you’re building more MCP servers and don’t want to rebuild billing from scratch each time, worth looking at: https://mnemopay.com
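
For readers who want to see the shape of the wrapping pattern itself (price check → authorization → tool execution → result delivery), here is a generic sketch. Every name in it (`Ledger`, `InMemoryLedger`, `withPayment`, `PaidResult`) is hypothetical and invented for illustration; it is not the MnemoPay SDK or any MCP library API. The handler is kept synchronous for brevity, whereas a real MCP tool handler would usually return a Promise.

```typescript
// Every name below is hypothetical; this is not the MnemoPay SDK.
interface Ledger {
  reserve(accountId: string, amount: number): string | null; // hold id, or null if refused
  settle(holdId: string): void;  // capture a hold after successful execution
  release(holdId: string): void; // give the credits back if the tool fails
}

interface PaidResult<T> {
  result: T;
  settlement: { holdId: string; charged: number };
}

class InMemoryLedger implements Ledger {
  private balances = new Map<string, number>();
  private holds = new Map<string, { accountId: string; amount: number }>();
  private nextHold = 1;

  deposit(accountId: string, amount: number): void {
    this.balances.set(accountId, this.balance(accountId) + amount);
  }

  balance(accountId: string): number {
    return this.balances.get(accountId) ?? 0;
  }

  reserve(accountId: string, amount: number): string | null {
    if (this.balance(accountId) < amount) return null;
    this.balances.set(accountId, this.balance(accountId) - amount);
    const holdId = `hold_${this.nextHold++}`;
    this.holds.set(holdId, { accountId, amount });
    return holdId;
  }

  settle(holdId: string): void {
    this.holds.delete(holdId); // credits were already deducted at reserve time
  }

  release(holdId: string): void {
    const hold = this.holds.get(holdId);
    if (!hold) return;
    this.holds.delete(holdId);
    this.deposit(hold.accountId, hold.amount);
  }
}

// Wraps a tool handler so it only runs once credits are reserved.
function withPayment<A, T>(ledger: Ledger, priceFor: (args: A) => number,
                           handler: (args: A) => T) {
  return (accountId: string, args: A): PaidResult<T> => {
    const charged = priceFor(args);                     // 1. price check
    const holdId = ledger.reserve(accountId, charged);  // 2. authorization
    if (holdId === null) throw new Error("payment required: insufficient credit");
    let result: T;
    try {
      result = handler(args);                           // 3. tool execution
    } catch (err) {
      ledger.release(holdId);                           // failed call: no charge
      throw err;
    }
    ledger.settle(holdId);
    return { result, settlement: { holdId, charged } }; // 4. result delivery
  };
}

const ledger = new InMemoryLedger();
ledger.deposit("agent-a", 2);
const shout = withPayment(ledger, () => 1, (args: { text: string }) => args.text.toUpperCase());
console.log(shout("agent-a", { text: "hi" }));
```

The design choice worth noting is that credits leave the spendable balance at reservation time, so a refused caller never reaches the handler and a failed call gets its credits back.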
