Is asyncio Really Better Than Multithreading? I Tested 100 Concurrent Requests, and the Difference Is Huge
Source: Dev.to
Introduction
Last month, the data platform I maintain suddenly got a new requirement: run health checks against 100+ downstream services. Each endpoint averages 200 ms, and the whole check had to finish within 5 seconds. Without thinking twice, I fired up 100 threads. The thread‑switching overhead immediately maxed out the CPU, and the response time shot past 8 seconds. My ops teammate dropped three question marks in the group chat.
That moment forced me to seriously re‑examine asyncio. I used to think async programming had a steep learning curve and was a magnet for bugs, but after a thorough benchmark I can only say: for I/O‑bound workloads, asyncio and multithreading aren’t even in the same league. Here’s the full breakdown of running the same task with three different strategies—synchronous, multithreaded, and asyncio—head‑to‑head.
We spun up a mock downstream service with FastAPI. The /health endpoint deliberately sleeps for 200 ms and then returns {"status": "ok"}. The client fires 100 concurrent requests using three different approaches, and we measure total elapsed time and resource usage.
Synchronous Demo (sync_demo.py)
# sync_demo.py — synchronous requests, one after another
import time

import requests

URLS = ["http://localhost:8000/health" for _ in range(100)]

def check_sync():
    results = []
    for url in URLS:
        resp = requests.get(url, timeout=5)
        results.append(resp.json())
    return results

if __name__ == "__main__":
    start = time.perf_counter()
    check_sync()
    elapsed = time.perf_counter() - start
    print(f"Synchronous total: {elapsed:.2f}s")  # around 20.3s
Unsurprisingly, 100 × 200 ms = 20 seconds. The thread spends all its time waiting for network I/O while the CPU sits nearly idle. That’s with only 100 requests; at 1 000 the system would be effectively frozen for three minutes with zero concurrency.
Multithreaded Demo (thread_demo.py)
# thread_demo.py — 100 threads in parallel
import time
from concurrent.futures import ThreadPoolExecutor, as_completed

import requests

URLS = ["http://localhost:8000/health" for _ in range(100)]

def fetch(url):
    return requests.get(url, timeout=5).json()

def check_thread():
    results = []
    with ThreadPoolExecutor(max_workers=100) as executor:
        futures = {executor.submit(fetch, url): url for url in URLS}
        for future in as_completed(futures):
            results.append(future.result())
    return results

if __name__ == "__main__":
    start = time.perf_counter()
    check_thread()
    elapsed = time.perf_counter() - start
    print(f"Multithreaded total: {elapsed:.2f}s")  # 8.5s on the first run, later fluctuating between 3 and 6s
The first run took 8.5 seconds, with CPU usage instantly spiking to 90 %. Python’s GIL is released during I/O, but creating 100 threads, the constant context switching, and lock contention add enormous overhead. When I dialed max_workers down to 30, the time dropped to 2.1 seconds and the CPU settled down—but that turns into “tuning by gut feeling,” and as soon as the thread count rises, the system becomes unstable again.
There’s an even sneakier trap: requests isn’t the most thread-safe choice. A shared Session object isn’t guaranteed to be thread-safe, and module-level requests.get() opens a fresh connection on every call, so connection-pool reuse is limited; occasionally you get a ConnectionResetError that’s a nightmare to debug.
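One common mitigation (my addition, not part of the original benchmark) is to give each worker thread its own requests.Session via threading.local, so connections are reused safely within a thread without sharing a Session across threads:

```python
# thread_local_session.py — hypothetical fix: one Session per thread
import threading

import requests

_local = threading.local()

def get_session() -> requests.Session:
    # Lazily create one Session per thread; each thread then reuses
    # its own connection pool instead of opening a socket per request.
    if not hasattr(_local, "session"):
        _local.session = requests.Session()
    return _local.session

def fetch(url: str) -> dict:
    return get_session().get(url, timeout=5).json()
```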
Asyncio Demo (async_demo.py)
# async_demo.py — concurrent requests with asyncio and aiohttp
import asyncio
import time

import aiohttp

URLS = ["http://localhost:8000/health" for _ in range(100)]

async def fetch(session, url):
    try:
        async with session.get(url, timeout=aiohttp.ClientTimeout(total=5)) as resp:
            return await resp.json()
    except Exception as e:
        return {"error": str(e)}

async def check_async():
    async with aiohttp.ClientSession() as session:
        tasks = [fetch(session, url) for url in URLS]
        results = await asyncio.gather(*tasks)
        return results

if __name__ == "__main__":
    start = time.perf_counter()
    asyncio.run(check_async())
    elapsed = time.perf_counter() - start
    print(f"asyncio total: {elapsed:.2f}s")  # steady at 0.45–0.60s
All 100 tasks are scheduled inside a single event loop and dispatched asynchronously. The total elapsed time is determined only by the slowest I/O call, consistently coming in under 0.6 seconds. CPU usage never topped 15 %, and memory usage stayed almost perfectly flat. When my boss saw the results on the monitoring dashboard he asked if I had secretly added more servers—turns out I just rewrote the code with async/await.
Takeaway
Mixing synchronous code into a coroutine instantly kills performance. At first I put requests.get() directly inside an async def and… (the rest of the story continues).
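The usual escape hatch for exactly this mistake, assuming Python 3.9+, is asyncio.to_thread: push the blocking call onto a worker thread so the event loop stays free. A self-contained sketch using time.sleep as a stand-in for a blocking call like requests.get():

```python
# to_thread_demo.py — sketch: wrapping blocking calls so they don't stall the loop
import asyncio
import time

def blocking_call(x: int) -> int:
    time.sleep(0.1)  # stands in for requests.get() or any blocking I/O
    return x * 2

async def wrapped(x: int) -> int:
    # to_thread hands the blocking call to the default thread-pool executor,
    # so the event loop can keep scheduling other coroutines meanwhile
    return await asyncio.to_thread(blocking_call, x)

async def main() -> list:
    # ten blocking calls run concurrently instead of 10 × 0.1 s serially
    return await asyncio.gather(*(wrapped(i) for i in range(10)))

if __name__ == "__main__":
    print(asyncio.run(main()))  # [0, 2, 4, ..., 18]
```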