FastAPI Performance: The Hidden Thread Pool Overhead You Might Be Missing

Published: December 5, 2025 at 01:48 PM EST
Source: Dev.to

Understanding the Problem

FastAPI is an incredible framework for building high‑performance APIs in Python. Its async capabilities, automatic validation, and excellent documentation make it a joy to work with. However, a subtle performance issue is often overlooked: unnecessary thread‑pool delegation for synchronous dependencies.

How FastAPI Handles Dependencies

FastAPI distinguishes between async and sync callables:

  • async def functions – executed directly in the event loop.
  • def functions – sent to a thread pool via anyio.to_thread.run_sync.

This behavior applies to both path‑operation functions and dependencies. Internally FastAPI performs a simplified check:

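import asyncio
from anyio import to_thread

# Simplified FastAPI logic (an illustration, not the actual source)
async def call_dependency(dependency):
    if asyncio.iscoroutinefunction(dependency):
        # Coroutine function: await it directly on the event loop
        return await dependency()
    # Plain callable (including classes): hand it off to the thread pool
    return await to_thread.run_sync(dependency)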

Because class constructors (__init__) are always synchronous, class‑based dependencies are always routed to the thread pool.

The Thread Pool Overhead

  • Default thread pool size: 40 threads.
  • Each thread‑pool execution incurs context‑switching, thread synchronization, and possible queuing when all threads are busy.
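The cost of a thread hand-off can be estimated with a rough micro-benchmark. The sketch below is an approximation under stated assumptions: it uses the standard library's asyncio.to_thread rather than AnyIO's thread pool, so absolute numbers will differ from what FastAPI sees, and the QueryParams class and time_calls helper are illustrative.

```python
import asyncio
import time

CALLS = 1000


class QueryParams:
    def __init__(self, q=None, skip=0, limit=100):
        self.q = q
        self.skip = skip
        self.limit = limit


async def time_calls(n: int = CALLS):
    # Construct the object inline on the event loop thread.
    start = time.perf_counter()
    for _ in range(n):
        QueryParams()
    direct = time.perf_counter() - start

    # Construct the same object in a worker thread and wait for the result.
    start = time.perf_counter()
    for _ in range(n):
        await asyncio.to_thread(QueryParams)
    threaded = time.perf_counter() - start
    return direct, threaded


direct, threaded = asyncio.run(time_calls())
print(f"direct:   {direct / CALLS * 1e6:.2f} µs per call")
print(f"threaded: {threaded / CALLS * 1e6:.2f} µs per call")
```

The exact figures depend on the machine and Python version, so treat the output as an indication of relative cost rather than a benchmark result.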

Example: Multiple Class‑Based Dependencies

from fastapi import Depends, FastAPI

app = FastAPI()

class QueryParams:
    def __init__(self, q: str | None = None, skip: int = 0, limit: int = 100):
        self.q = q
        self.skip = skip
        self.limit = limit

@app.get("/items/")
async def read_items(params: QueryParams = Depends()):
    return {"q": params.q, "skip": params.skip, "limit": params.limit}

Each request creates a QueryParams instance in the thread pool, even though it only performs simple assignments.

If an endpoint has several such dependencies, the overhead multiplies:

@app.get("/complex-endpoint/")
async def complex_operation(
    auth: AuthParams = Depends(),
    query: QueryParams = Depends(),
    pagination: PaginationParams = Depends(),
    filters: FilterParams = Depends(),
):
    pass  # Four dependencies → four thread‑pool tasks

With 100 concurrent requests, that is 400 thread‑pool tasks in total. Only 40 can run at once, so the rest wait for a free thread, which shows up as latency spikes under load.
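The queuing effect can be simulated without FastAPI. This is a toy model, not a measurement: an asyncio.Semaphore of 40 stands in for the thread pool's capacity limit, a 1 ms sleep stands in for the cost of one hand-off, and each simulated request resolves its four dependencies one after another.

```python
import asyncio

POOL_SIZE = 40          # default thread pool capacity quoted above
REQUESTS = 100          # concurrent requests
DEPS_PER_REQUEST = 4    # class-based dependencies per request


async def simulate():
    limiter = asyncio.Semaphore(POOL_SIZE)
    stats = {"running": 0, "peak": 0, "total": 0}

    async def pooled_dependency():
        # Wait for a free slot, as a task would wait for a free thread.
        async with limiter:
            stats["running"] += 1
            stats["peak"] = max(stats["peak"], stats["running"])
            stats["total"] += 1
            await asyncio.sleep(0.001)  # stand-in for the hand-off cost
            stats["running"] -= 1

    async def handle_request():
        for _ in range(DEPS_PER_REQUEST):
            await pooled_dependency()

    await asyncio.gather(*(handle_request() for _ in range(REQUESTS)))
    return stats


stats = asyncio.run(simulate())
print(f"tasks run: {stats['total']}, peak in flight: {stats['peak']}")
```

In this model every one of the 400 tasks eventually runs, but never more than 40 are in flight, so the remaining requests spend time waiting for a slot.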

Real‑World Impact

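  • API with 50 endpoints
  • Average 3 class‑based dependencies per endpoint
  • 1,000 requests per second to each endpoint

→ 50 × 3 × 1,000 = 150,000 unnecessary thread‑pool operations per second. If the 1,000 requests per second is instead the total across all endpoints, the figure is 3 × 1,000 = 3,000 per second. Either way, each operation is cheap on its own, but the cumulative overhead can become a bottleneck.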

The Solution: fastapi-async-safe-dependencies

fastapi-async-safe-dependencies is a lightweight library that lets you mark certain dependencies as safe to run directly on the event loop, bypassing the thread pool.

Installation

pip install fastapi-async-safe-dependencies

Basic Usage

from fastapi import Depends, FastAPI
from fastapi_async_safe import async_safe, init_app

app = FastAPI()
init_app(app)  # Initialize the library

@async_safe  # Mark as safe for async context
class QueryParams:
    def __init__(self, q: str | None = None, skip: int = 0, limit: int = 100):
        self.q = q
        self.skip = skip
        self.limit = limit

@app.get("/items/")
async def read_items(params: QueryParams = Depends()):
    return {"q": params.q, "skip": params.skip, "limit": params.limit}

What changed?

  1. Call init_app(app) at startup.
  2. Decorate dependency classes with @async_safe.

How It Works Under the Hood

When a class is decorated with @async_safe, the library creates an async wrapper:

# Simplified wrapper generated by @async_safe
async def _wrapper(**kwargs):
    return YourClass(**kwargs)  # Direct constructor call

Because the wrapper is a coroutine function, asyncio.iscoroutinefunction returns True, so FastAPI awaits it directly on the event loop with no thread‑pool involvement.

init_app() walks through all routes and dependencies, replacing class references with these wrappers. The wrapper itself never awaits anything; it runs the synchronous constructor inline on the event loop, which is safe only when the constructor is non‑blocking.

Supporting Inheritance

from fastapi_async_safe import async_safe

@async_safe
class BaseParams:
    def __init__(self, limit: int = 100):
        self.limit = min(limit, 1000)

class QueryParams(BaseParams):
    def __init__(self, q: str | None = None, **kwargs):
        super().__init__(**kwargs)
        self.q = q

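Subclasses inherit the marker, so QueryParams above is treated as async‑safe without its own decorator. If a subclass does need thread‑pool execution (e.g., it performs I/O), mark it with @async_unsafe: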

from fastapi_async_safe import async_safe, async_unsafe

@async_safe
class BaseParams:
    pass

@async_unsafe  # Will be sent to thread pool
class HeavyParams(BaseParams):
    def __init__(self):
        self.data = some_blocking_operation()

Global Opt‑In

init_app(app, all_classes_safe=True)  # Treat all class‑based dependencies as async‑safe
# Use @async_unsafe only for exceptions
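One way to picture how the inheritance and global opt-in rules combine is a class attribute that subclasses inherit and can override. The sketch below is an assumption about the general approach, not the library's implementation; mark_safe, mark_unsafe, is_async_safe, and the _async_safe attribute are all hypothetical names.

```python
def mark_safe(cls):
    """Hypothetical stand-in for @async_safe."""
    cls._async_safe = True
    return cls


def mark_unsafe(cls):
    """Hypothetical stand-in for @async_unsafe."""
    cls._async_safe = False
    return cls


def is_async_safe(cls, all_classes_safe: bool = False) -> bool:
    # An explicit marker (own or inherited) wins; otherwise use the global default.
    return getattr(cls, "_async_safe", all_classes_safe)


@mark_safe
class BaseParams:
    pass


class QueryParams(BaseParams):   # no decorator: inherits the safe marker
    pass


@mark_unsafe
class HeavyParams(BaseParams):   # overrides the inherited marker
    pass


class Unmarked:                  # no marker anywhere in its hierarchy
    pass


for cls in (BaseParams, QueryParams, HeavyParams, Unmarked):
    print(cls.__name__, is_async_safe(cls), is_async_safe(cls, all_classes_safe=True))
```

Under this model, the global flag only changes the outcome for classes that carry no marker at all, which matches the advice to reserve @async_unsafe for the exceptions.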

Using with Synchronous Functions

The decorator also works with plain functions:

from fastapi import Depends, FastAPI
from fastapi_async_safe import async_safe, init_app

app = FastAPI()
init_app(app)

@async_safe
def get_common_params(q: str | None = None, skip: int = 0, limit: int = 100) -> dict:
    return {"q": q, "skip": skip, "limit": limit}

@app.get("/items/")
async def read_items(params: dict = Depends(get_common_params)):
    return params

Benchmarks & Results

Scenario                                Performance Gain
Single class dependency per endpoint    15–25% ↑ requests/sec
Multiple class dependencies             40–60% ↑ requests/sec
1000+ concurrent requests (p95)         30–50% ↓ latency
Thread‑pool saturation                  Eliminated

Best Practices

When to Use @async_safe

✅ Simple data classes
✅ Parameter‑validation classes
✅ Configuration objects
✅ Non‑blocking utility functions
✅ Pydantic model wrappers

Do NOT use for:

  • Database queries
  • File I/O
  • External API calls
  • CPU‑intensive calculations
  • Anything that truly blocks the event loop
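The reason for this list can be demonstrated with the standard library alone. In the sketch below, a background ticker counts how often the event loop gets to run while a dependency executes. The blocking_constructor function (a 0.2 s sleep standing in for a database query or file read) and the other names are illustrative, and asyncio.to_thread is used in place of FastAPI's thread pool.

```python
import asyncio
import time


def blocking_constructor():
    time.sleep(0.2)  # stand-in for a database query or file read
    return "done"


async def inline():
    # What an async-safe wrapper would do: call the blocking code on the loop.
    return blocking_constructor()


async def in_thread():
    # What the thread pool does: run the blocking code off the loop.
    return await asyncio.to_thread(blocking_constructor)


async def count_ticks(run_dependency):
    ticks = 0

    async def ticker():
        nonlocal ticks
        while True:
            await asyncio.sleep(0.01)
            ticks += 1

    task = asyncio.create_task(ticker())
    await asyncio.sleep(0)   # let the ticker start
    await run_dependency()
    task.cancel()
    return ticks


async def main():
    return await count_ticks(inline), await count_ticks(in_thread)


inline_ticks, threaded_ticks = asyncio.run(main())
print(f"ticks while blocking inline: {inline_ticks}")
print(f"ticks while blocking in a thread: {threaded_ticks}")
```

When the blocking call runs inline, the ticker never gets a turn, which is what every other request on the same event loop would experience; in a worker thread, the loop keeps running.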

Adoption Strategy

  1. Start Small – Apply to your most‑called endpoints.
  2. Monitor – Verify that latency improves and no regressions appear.
  3. Expand – Gradually mark more dependencies as async‑safe.
  4. Consider Global Opt‑In – Once confident, use all_classes_safe=True.

Testing

Existing tests remain unchanged:

from fastapi.testclient import TestClient

from main import app  # hypothetical module that defines the FastAPI app

def test_endpoint():
    client = TestClient(app)
    response = client.get("/items/?q=test&limit=50")
    assert response.status_code == 200
    assert response.json()["q"] == "test"

Caveats

  • Premature optimization – Only adopt if you observe performance issues.
  • Blocking dependencies – Keep them in the thread pool (@async_unsafe).
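  • Profile first – Confirm that dependency resolution is actually the bottleneck before applying the library. Debug logging (uvicorn --log-level debug) adds detail about what the server is doing but does not measure where time is spent, so pair it with a load test or an external profiler.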