How to Benchmark Web Frameworks in a Fair, Isolated Way | Mahdi Shamlou
Source: Dev.to
Hey everyone! Mahdi Shamlou here.
I've seen many posts online comparing web frameworks, but most of them are either biased, outdated, or hard to reproduce. So I wanted to share a practical way to benchmark any web framework, keeping everything isolated, fair, and reproducible.
We'll use Docker for isolation, k6 for load testing, and two Python frameworks, FastAPI and Flask, as simple examples. The approach works for Node.js, Go, Java, Rust, or anything else.
Overview
Benchmarking web frameworks can be tricky. Many factors affect results:
- CPU & memory availability
- Number of workers / threads
- Background processes
- Routing, logging, database, I/O
To make a fair comparison you need:
- Docker containers with fixed CPU & memory limits
- Identical routes or endpoints in each framework
- Controlled load tests using k6 (or similar tools)
- Results saved for later analysis
Project Structure
mkdir web_framework_benchmarks
cd web_framework_benchmarks
mkdir framework1 framework2 k6-tests results
You can replace framework1 and framework2 with any frameworks you want to compare. For demonstration we use a simple /hello endpoint.
FastAPI (Python)
# app.py
from fastapi import FastAPI

app = FastAPI()

@app.get("/hello")
def hello():
    return {"message": "hello world"}
Flask (Python)
# app.py
from flask import Flask, jsonify

app = Flask(__name__)

@app.route("/hello")
def hello():
    return jsonify({"message": "hello world"})
You can implement the same endpoint in Node.js, Go, Java, etc., keeping the functionality identical. Optionally add a sleep route to simulate I/O-heavy work.
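To see why such a sleep route matters, here is a minimal, framework-free sketch (handler names and the 50 ms delay are my choices, not from the repo): an async handler that awaits its sleep lets many in-flight requests overlap, which is exactly the workload where async frameworks like FastAPI pull ahead of one blocking worker.

```python
import asyncio
import time

# Hypothetical I/O-heavy handler: the await stands in for a slow
# database query or external API call. While one request waits,
# the event loop is free to serve others.
async def hello_io(delay: float = 0.05) -> dict:
    await asyncio.sleep(delay)
    return {"message": "hello world"}

async def main() -> float:
    start = time.perf_counter()
    # 20 concurrent "requests" overlap their waits instead of queuing,
    # so total time stays close to a single delay, not 20 * delay.
    await asyncio.gather(*(hello_io() for _ in range(20)))
    return time.perf_counter() - start

if __name__ == "__main__":
    elapsed = asyncio.run(main())
    print(f"20 overlapping 50 ms sleeps finished in {elapsed:.2f}s")
```

A blocking `time.sleep(delay)` handler on a single sync worker would instead serve these 20 requests one after another.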
Dockerfiles (Fair Comparison)
FastAPI Dockerfile
# Dockerfile (FastAPI)
FROM python:3.11-slim
WORKDIR /app
COPY . .
RUN pip install fastapi uvicorn gunicorn
CMD ["gunicorn", "-k", "uvicorn.workers.UvicornWorker", "-w", "1", "-b", "0.0.0.0:8000", "app:app"]
Flask Dockerfile
# Dockerfile (Flask)
FROM python:3.11-slim
WORKDIR /app
COPY . .
RUN pip install flask gunicorn
CMD ["gunicorn", "-w", "1", "-b", "0.0.0.0:8000", "app:app"]
✅ Both containers now run the same worker count (`-w 1`). The CPU and memory limits are not part of the Dockerfiles; they are applied when the containers are started, and must be identical for both, to ensure a fair baseline.
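A minimal sketch of building and starting both containers with matching resource caps (the image names, the 1 CPU / 512 MB limits, and the 8001/8002 host-port choices are mine; the k6 script targets port 8001):

```shell
# Build one image per framework directory
docker build -t bench-fastapi ./framework1
docker build -t bench-flask   ./framework2

# Start both with identical CPU and memory caps, mapping host ports
# 8001/8002 to the containers' port 8000 where gunicorn listens.
docker run -d --name bench-fastapi --cpus=1 --memory=512m -p 8001:8000 bench-fastapi
docker run -d --name bench-flask   --cpus=1 --memory=512m -p 8002:8000 bench-flask
```

Benchmark one container at a time if the host is resource-constrained, so the two servers don't compete for the same cores during a run.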
Load Testing with k6
k6 Script (JavaScript)
// k6-tests/framework1.js (or framework2.js)
import http from "k6/http";
import { sleep } from "k6";

export const options = {
  stages: [
    { duration: "30s", target: 50 },  // ramp-up
    { duration: "1m", target: 200 },  // hold load
    { duration: "30s", target: 0 },   // ramp-down
  ],
  thresholds: {
    http_req_duration: ["p(95)<200"], // 95% of requests should be < 200 ms
  },
};

export default function () {
  http.get("http://localhost:8001/hello"); // use the second framework's port in framework2.js
  sleep(1);
}
Running the Tests
mkdir -p results
k6 run --out json=results/framework1.json k6-tests/framework1.js
k6 run --out json=results/framework2.json k6-tests/framework2.js
The JSON output can be processed with any language or plotting tool.
Analyzing Results
Below is a minimal Python script that extracts average latency, 95th-percentile latency, request count, and failure rate, then plots the average response time.
# analyze.py
import json

import matplotlib.pyplot as plt
import numpy as np

files = {
    "Framework1": "results/framework1.json",
    "Framework2": "results/framework2.json",
}

summary = {}
for name, file in files.items():
    durations, fails, total = [], 0, 0
    with open(file) as f:
        for line in f:
            obj = json.loads(line)
            if obj.get("type") == "Point":
                metric = obj.get("metric")
                if metric == "http_req_duration":
                    durations.append(obj["data"]["value"])
                if metric == "http_req_failed":
                    fails += obj["data"]["value"]
                    total += 1
    if durations:
        summary[name] = {
            "avg": np.mean(durations),
            "p95": np.percentile(durations, 95),
            "requests": len(durations),
            "fail_rate": fails / total if total else 0,
        }

print(summary)

plt.bar(summary.keys(), [summary[n]["avg"] for n in summary])
plt.title("Average Response Time (ms)")
plt.ylabel("Milliseconds")
plt.show()
Key Takeaways
- Docker isolation makes the benchmark reproducible.
- Worker count & CPU limits must match across containers.
- Simple routes may make Flask look faster; don't be fooled.
- Async/I/O-heavy routes showcase the strengths of FastAPI (or other async frameworks).
- Always benchmark your actual workload, not just tiny examples.
Repository
The full example, including Dockerfiles, k6 scripts, and analysis code, is available at:
https://github.com/mahdi-shamlou/web_framework_benchmarks