Reducing Sentry APM Costs in FastAPI by Sending Only What Matters
Source: Dev.to
The Real Problem with Default APM
By default, Sentry APM is very generous:
- Every request becomes a transaction
- Even sub‑second successful calls are recorded
- Docs and schema endpoints are also traced
For high‑traffic APIs, this quickly turns into:
- Large transaction volume
- Faster quota exhaustion
- Paying for noise instead of insight
In reality, I only needed visibility into:
- Requests that fail (5xx)
- Requests that are slow
- Anything abnormal or risky
Everything else was just background noise.
The Cost‑Saving Strategy
Always Send to Sentry
- Any request returning 5xx
- Any request taking more than 5 seconds (the filter shown later is stricter and keeps any non-docs request that takes 3 seconds or longer)
Drop from Sentry
- Fast GET / POST / PUT requests
- Successful requests completing under 3 seconds
- /docs and /openapi.json endpoints
This keeps Sentry focused on problems, not traffic volume.
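The rules above can be read as one small decision function. The sketch below is only an illustration: the names should_send, SLOW_THRESHOLD_SECONDS, and IGNORED_PATHS are made up for this example, the 3-second cutoff is taken from the drop rule, and keeping requests whose duration is unknown is an assumption.

```python
from typing import Optional

# Hypothetical names for illustration; the cutoff mirrors the 3-second drop rule.
SLOW_THRESHOLD_SECONDS = 3.0
IGNORED_PATHS = ("/docs", "/openapi.json")
FILTERED_METHODS = ("GET", "POST", "PUT")


def should_send(method: str, path: str, status_code: int, duration: Optional[float]) -> bool:
    """Return True if a request is worth a Sentry transaction."""
    # Docs and schema endpoints are never kept.
    if any(ignored in path for ignored in IGNORED_PATHS):
        return False
    # Server errors are always kept.
    if status_code >= 500:
        return True
    # Fast, successful GET / POST / PUT requests are dropped.
    if (
        method in FILTERED_METHODS
        and 200 <= status_code < 400
        and duration is not None
        and duration < SLOW_THRESHOLD_SECONDS
    ):
        return False
    # Everything else (slow, 4xx, other methods, unknown duration) is kept.
    return True


print(should_send("GET", "/users", 200, 0.12))  # False
print(should_send("GET", "/users", 200, 6.0))   # True
print(should_send("POST", "/users", 500, 0.1))  # True
print(should_send("GET", "/docs", 200, 0.05))   # False
```

Keeping the decision separate from the Sentry event format makes the thresholds easy to tune and to test without a running application.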
Why Two Middlewares Are Required
SentryAsgiMiddleware – Enables APM
SentryAsgiMiddleware is what actually:
- Starts and finishes Sentry transactions
- Hooks into the ASGI request lifecycle
- Sends performance data to Sentry
Without this middleware:
- No transactions are created
- before_send_transaction is never called
- APM simply does not work
In short: No SentryAsgiMiddleware = No APM
TimingMiddleware – Adds Intelligence
The second middleware is custom. It measures the real execution time of each request and attaches it to the Sentry scope.
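import time

import sentry_sdk
from fastapi import Request
from starlette.middleware.base import BaseHTTPMiddleware


class TimingMiddleware(BaseHTTPMiddleware):
    async def dispatch(self, request: Request, call_next):
        start_time = time.perf_counter()  # monotonic clock, safe for measuring durations
        response = await call_next(request)
        duration = time.perf_counter() - start_time
        # Attach the duration so before_send_transaction can read it from event["extra"].
        with sentry_sdk.configure_scope() as scope:
            scope.set_extra("duration", duration)
        return response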
Why this is needed:
- Execution time is required to decide whether a request is “important”
- Sentry’s internal timing isn’t easily usable for filtering
- Without this, cost‑control logic becomes guesswork
Think of it this way:
- SentryAsgiMiddleware is the pipeline
- TimingMiddleware is the brain
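The division of labour can be shown with a toy model. The sketch below uses no Sentry or Starlette code: shared_scope, sent_events, timing_layer, and pipeline_layer are hypothetical stand-ins. It assumes the timing layer runs inside the pipeline layer, so the duration is already recorded when the outer layer finishes and ships its data.

```python
import asyncio
import time

# Toy stand-ins, not Sentry's or Starlette's real objects.
shared_scope = {}  # plays the role of the Sentry scope
sent_events = []   # plays the role of data shipped to Sentry


async def handler():
    """Pretend endpoint that does a little work and returns a status code."""
    await asyncio.sleep(0.05)
    return 200


async def timing_layer(inner):
    """Inner layer ("the brain"): measures the call and stores the result."""
    start = time.perf_counter()
    status = await inner()
    shared_scope["duration"] = time.perf_counter() - start
    return status


async def pipeline_layer(inner):
    """Outer layer ("the pipeline"): finishes last, so it can read the duration."""
    status = await inner()
    sent_events.append({"status_code": status, "extra": dict(shared_scope)})
    return status


async def main():
    # The pipeline wraps the timing layer, which wraps the handler.
    return await pipeline_layer(lambda: timing_layer(handler))


asyncio.run(main())
print(sent_events[0]["status_code"])          # 200
print("duration" in sent_events[0]["extra"])  # True
```

If the layers were nested the other way round, the outer timing layer would write its value only after the event had already been recorded, which is why the order of the two middlewares matters in this model.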
Filtering Transactions Before They Are Sent
Sentry provides a hook called before_send_transaction. This runs just before a transaction is sent to Sentry and allows you to drop it.
def before_send_transaction(event, hint):
    transaction_name = event.get("transaction", "")
    request_method = event.get("request", {}).get("method", "")
    status_code = event.get("contexts", {}).get("response", {}).get("status_code", 0)
    # Set by TimingMiddleware; None if the middleware did not run
    duration = event.get("extra", {}).get("duration")

    # Ignore docs and schema
    if "/docs" in transaction_name or "/openapi.json" in transaction_name:
        return None

    # Always send server errors
    if status_code >= 500:
        return event

    # Drop fast successful requests
    if (
        request_method in ["GET", "POST", "PUT"]
        and 200 <= status_code < 400
        and duration is not None
        and duration < 3
    ):
        return None

    # Keep everything else: slow requests, 4xx responses, unknown durations
    return event
- Returning event → transaction is sent
- Returning None → transaction is dropped
Simple, predictable, and fully under your control.
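Because the hook is a plain function over a dict, it can be exercised with hand-built events before it goes anywhere near production. The sketch below mirrors the hook's rules so that it runs on its own. make_event is a hypothetical helper that fills in only the keys the hook reads; real Sentry events carry many more fields.

```python
def before_send_transaction(event, hint):
    """Mirror of the hook's rules, repeated so this snippet is self-contained."""
    name = event.get("transaction", "")
    method = event.get("request", {}).get("method", "")
    status = event.get("contexts", {}).get("response", {}).get("status_code", 0)
    duration = event.get("extra", {}).get("duration")
    if "/docs" in name or "/openapi.json" in name:
        return None
    if status >= 500:
        return event
    if (
        method in ("GET", "POST", "PUT")
        and 200 <= status < 400
        and duration is not None
        and duration < 3
    ):
        return None
    return event


def make_event(name, method, status, duration):
    """Hypothetical helper: builds a dict with only the keys the hook reads."""
    return {
        "transaction": name,
        "request": {"method": method},
        "contexts": {"response": {"status_code": status}},
        "extra": {"duration": duration},
    }


cases = {
    "fast success": make_event("/users", "GET", 200, 0.2),
    "slow success": make_event("/reports", "GET", 200, 4.5),
    "server error": make_event("/users", "POST", 500, 0.1),
    "docs page": make_event("/docs", "GET", 200, 0.05),
}
results = {}
for label, event in cases.items():
    results[label] = "dropped" if before_send_transaction(event, {}) is None else "sent"
    print(f"{label}: {results[label]}")
# fast success: dropped
# slow success: sent
# server error: sent
# docs page: dropped
```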
Initializing Sentry with Custom Filtering
sentry_sdk.init(
    dsn="SENTRY_DSN",
    send_default_pii=True,
    traces_sample_rate=1.0,
    before_send_transaction=before_send_transaction,
)
Instead of relying on random sampling, this approach gives deterministic filtering based on real behavior.
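For rules that depend only on the path, the SDK's traces_sampler option is a possible complement, since it can skip a transaction before it is even started. The sketch below is an assumption about how that could look here, not part of the original setup. It relies on the ASGI integration passing the raw ASGI scope under the "asgi_scope" key of the sampling context.

```python
def traces_sampler(sampling_context):
    """Decide at request start whether to trace at all (1.0 = always, 0.0 = never)."""
    # "asgi_scope" may be missing or None for non-ASGI transactions,
    # so fall back to an empty dict.
    asgi_scope = sampling_context.get("asgi_scope") or {}
    path = asgi_scope.get("path", "")
    # Skip docs and schema endpoints entirely.
    if path.startswith("/docs") or path == "/openapi.json":
        return 0.0
    # Trace everything else; before_send_transaction makes the final call.
    return 1.0


print(traces_sampler({"asgi_scope": {"path": "/docs"}}))          # 0.0
print(traces_sampler({"asgi_scope": {"path": "/openapi.json"}}))  # 0.0
print(traces_sampler({"asgi_scope": {"path": "/users"}}))         # 1.0
```

It would be passed to sentry_sdk.init as traces_sampler=traces_sampler, in which case the SDK uses it instead of traces_sample_rate. A sampler runs before the request is handled, so it cannot see the status code or the duration; those rules still need before_send_transaction.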
What Changed After This
Lower Cost
Transaction volume dropped sharply, and quota consumption slowed immediately.
Cleaner Dashboards
Only slow or failing requests appeared, making debugging easier.
Better Signal
Every transaction in Sentry now means “This is worth looking at.”
When This Approach Makes Sense
- Your API traffic is high
- Most requests are successful and fast
- You care more about issues than raw metrics
If you want every request traced forever, this is not the right approach. If you want useful observability without burning money, it absolutely is.
Final Thoughts
APM should help you find problems, not create new ones in your billing dashboard.
By combining:
- SentryAsgiMiddleware
- A simple timing middleware
- before_send_transaction
you turn Sentry from “collect everything” into “collect what actually matters.” That small change makes a huge difference in real production systems.