Leveraging io_uring for performant asynchronous Linux applications.
Source: Dev.to
author: Sospeter Kinyanjui
Intro
For a long time, the main scalable I/O interface on Linux was epoll, an I/O-notification facility that tells an application which file descriptors are ready so that it can then issue read/write system calls on them.
epoll first appeared in Linux 2.5.44 (2002) and became mainstream with 2.6 (2003). It uses the readiness model via the three system calls epoll_create, epoll_ctl, and epoll_wait. The kernel notifies the application when a resource is ready, and the application then performs the I/O itself.
Because the kernel reports only the descriptors that are ready, the cost of waiting does not grow with the number of watched connections: it is effectively O(1) per event, whether you watch 10 connections or 10 000. However, every wait and every read or write that follows is still a system call, which means a costly syscall tax: a transition from user mode to kernel mode and back each time.
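To make the readiness model concrete, here is a minimal sketch in C that watches the read end of a pipe with epoll. The helper name epoll_roundtrip and the one-second timeout are illustrative assumptions, not something from the original post. Note that receiving the data takes two system calls: epoll_wait to learn that the descriptor is readable, then read to move the bytes.

```c
#include <assert.h>
#include <stddef.h>
#include <sys/epoll.h>
#include <unistd.h>

/* Hypothetical helper: sends msg through a pipe and receives it via epoll.
 * Returns the number of bytes read, or -1 on any error. */
long epoll_roundtrip(const char *msg, size_t len, char *out, size_t cap)
{
    assert(msg != NULL && out != NULL && cap > 0);

    struct epoll_event ev = {0};
    struct epoll_event ready[1];
    long got = -1;
    int p[2];

    if (pipe(p) < 0)
        return -1;

    int ep = epoll_create(1);          /* the size hint only has to be > 0 */
    if (ep >= 0) {
        /* Register interest: report when the read end becomes readable. */
        ev.events = EPOLLIN;
        ev.data.fd = p[0];

        if (epoll_ctl(ep, EPOLL_CTL_ADD, p[0], &ev) == 0 &&
            write(p[1], msg, len) == (ssize_t)len) {
            /* Syscall 1: ask the kernel which descriptors are ready. */
            int n = epoll_wait(ep, ready, 1, 1000);

            /* Syscall 2: the application performs the actual I/O itself. */
            if (n == 1 && (ready[0].events & EPOLLIN))
                got = (long)read(ready[0].data.fd, out, cap);
        }
        close(ep);
    }

    close(p[0]);
    close(p[1]);
    return got;
}
```

Calling epoll_roundtrip("ping", 4, buf, sizeof buf) with a 16-byte buf should return 4 and leave "ping" in buf. With thousands of connections, that wait-then-read pair is repeated for every ready descriptor.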
It wasn't until 2019, with Linux 5.1, that io_uring arrived, providing a Linux kernel interface for truly asynchronous I/O with far fewer system calls.
What is asynchronous execution?
The ability of an application to start a long‑running task and continue executing other work without waiting for that task to finish.
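Asynchronous execution makes better use of CPU and I/O resources. epoll is event-driven: it reports readiness, but the read or write itself is still a synchronous call made by the application, so the asynchrony is partly an illusion. io_uring, by contrast, lets the application queue multiple I/O requests and submit them with a single system call; the kernel carries them out independently and reports each result as a completion.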
Definition and Implementation
io_uring exposes three system calls:
| Call | Purpose |
|---|---|
| io_uring_setup(2) | Creates the submission queue (SQ) and completion queue (CQ) and returns a file descriptor. It configures the ring buffers (head, tail, ring_mask, ring_entries). |
| io_uring_enter(2) | Tells the kernel “I have placed SQEs in the ring; go process them.” |
| io_uring_register(2) | Pre-registers resources (e.g., buffers, files) with the kernel to avoid per-request look-ups. |
io_uring_setup
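- Sets up a memory region shared between the kernel and the application (mapped into user space with mmap) that holds the SQ and CQ structures.
- The application produces SQ entries: it writes them and advances the tail, and the kernel consumes them.
- The kernel produces CQ entries: it writes them and advances the tail, and the application consumes them.
- Each ring therefore follows a single-producer / single-consumer model, which keeps locks out of the fast path.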
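The single-producer / single-consumer idea can be sketched entirely in user space. The code below is a simplified model, not the kernel's actual memory layout: the struct and function names are hypothetical, and only the field names head, tail, ring_mask, and ring_entries mirror the ones mentioned above. The producer only ever writes tail and the consumer only ever writes head, which is why no lock is needed.

```c
#include <assert.h>
#include <stdatomic.h>
#include <stdbool.h>
#include <stdint.h>

#define RING_ENTRIES 8u                /* must be a power of two */

/* Illustrative model of one ring (think: SQ or CQ). */
struct spsc_ring {
    _Atomic uint32_t head;             /* advanced only by the consumer */
    _Atomic uint32_t tail;             /* advanced only by the producer */
    uint32_t ring_mask;                /* ring_entries - 1 */
    uint32_t ring_entries;
    uint64_t slots[RING_ENTRIES];
};

void ring_init(struct spsc_ring *r)
{
    assert((RING_ENTRIES & (RING_ENTRIES - 1u)) == 0);
    atomic_init(&r->head, 0);
    atomic_init(&r->tail, 0);
    r->ring_entries = RING_ENTRIES;
    r->ring_mask = RING_ENTRIES - 1u;
}

/* Producer side: returns false when the ring is full. */
bool ring_push(struct spsc_ring *r, uint64_t value)
{
    uint32_t tail = atomic_load_explicit(&r->tail, memory_order_relaxed);
    uint32_t head = atomic_load_explicit(&r->head, memory_order_acquire);
    if (tail - head == r->ring_entries)
        return false;
    r->slots[tail & r->ring_mask] = value;
    /* Release: the slot contents become visible before the new tail. */
    atomic_store_explicit(&r->tail, tail + 1u, memory_order_release);
    return true;
}

/* Consumer side: returns false when the ring is empty. */
bool ring_pop(struct spsc_ring *r, uint64_t *out)
{
    uint32_t head = atomic_load_explicit(&r->head, memory_order_relaxed);
    uint32_t tail = atomic_load_explicit(&r->tail, memory_order_acquire);
    if (head == tail)
        return false;
    *out = r->slots[head & r->ring_mask];
    /* Release: the slot is free for reuse only after it has been read. */
    atomic_store_explicit(&r->head, head + 1u, memory_order_release);
    return true;
}
```

The indices grow without wrapping and are masked only when a slot is addressed, so tail - head is always the number of queued entries, even after the 32-bit counters overflow.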
io_uring_enter
The “engine starter” for the whole operation. Its prototype:
#include <liburing.h>
int io_uring_enter(unsigned int fd,
                   unsigned int to_submit,
                   unsigned int min_complete,
                   unsigned int flags,
                   sigset_t *sig);
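Calling io_uring_enter notifies the kernel that to_submit SQEs are ready for processing. If IORING_ENTER_GETEVENTS is set in flags, the same call also waits until at least min_complete completions are available.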
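As a rough illustration of that flow without liburing, the sketch below calls the raw syscalls through syscall(2), maps the rings, queues a single IORING_OP_NOP, and submits it with one io_uring_enter call that also waits for the completion. The helper names (ring_setup, ring_enter, map_ring, run_nop, submit_one_nop) and the user_data value 42 are assumptions made for this example; it also assumes a GCC or Clang toolchain for the __atomic builtins and kernel headers that ship linux/io_uring.h. Real code would normally use liburing instead.

```c
#define _GNU_SOURCE
#include <assert.h>
#include <errno.h>
#include <linux/io_uring.h>
#include <stdint.h>
#include <string.h>
#include <sys/mman.h>
#include <sys/syscall.h>
#include <unistd.h>

/* Thin wrappers over the raw syscalls (illustrative names). */
static int ring_setup(unsigned entries, struct io_uring_params *p)
{
    return (int)syscall(__NR_io_uring_setup, entries, p);
}

static int ring_enter(int fd, unsigned to_submit, unsigned min_complete,
                      unsigned flags)
{
    return (int)syscall(__NR_io_uring_enter, fd, to_submit, min_complete,
                        flags, NULL, 0);
}

static void *map_ring(int fd, size_t len, off_t off)
{
    return mmap(NULL, len, PROT_READ | PROT_WRITE,
                MAP_SHARED | MAP_POPULATE, fd, off);
}

static long run_nop(int fd, const struct io_uring_params *p, char *sq,
                    char *cq, struct io_uring_sqe *sqes)
{
    uint32_t *sq_tail  = (uint32_t *)(sq + p->sq_off.tail);
    uint32_t *sq_mask  = (uint32_t *)(sq + p->sq_off.ring_mask);
    uint32_t *sq_array = (uint32_t *)(sq + p->sq_off.array);
    uint32_t *cq_head  = (uint32_t *)(cq + p->cq_off.head);
    uint32_t *cq_tail  = (uint32_t *)(cq + p->cq_off.tail);
    uint32_t *cq_mask  = (uint32_t *)(cq + p->cq_off.ring_mask);
    struct io_uring_cqe *cqes = (struct io_uring_cqe *)(cq + p->cq_off.cqes);

    /* Fill one SQE, publish its index, then advance the SQ tail. */
    uint32_t tail = *sq_tail;
    uint32_t idx = tail & *sq_mask;
    memset(&sqes[idx], 0, sizeof sqes[idx]);
    sqes[idx].opcode = IORING_OP_NOP;
    sqes[idx].user_data = 42;
    sq_array[idx] = idx;
    __atomic_store_n(sq_tail, tail + 1, __ATOMIC_RELEASE);

    /* One syscall: submit 1 SQE and wait for at least 1 completion. */
    if (ring_enter(fd, 1, 1, IORING_ENTER_GETEVENTS) < 0)
        return -errno;

    /* Consume the CQE the kernel produced, then advance the CQ head. */
    uint32_t head = *cq_head;
    if (head == __atomic_load_n(cq_tail, __ATOMIC_ACQUIRE))
        return -EAGAIN;
    struct io_uring_cqe *cqe = &cqes[head & *cq_mask];
    long result = cqe->res < 0 ? cqe->res : (long)cqe->user_data;
    __atomic_store_n(cq_head, head + 1, __ATOMIC_RELEASE);
    return result;
}

/* Returns 42 (the NOP's user_data) on success, or a negative errno value
 * if io_uring is unavailable, e.g. on an old kernel or in a sandbox. */
long submit_one_nop(void)
{
    struct io_uring_params p;
    memset(&p, 0, sizeof p);
    int fd = ring_setup(4, &p);
    if (fd < 0)
        return -errno;
    assert(p.sq_entries > 0 && p.cq_entries > 0);

    size_t sq_len = p.sq_off.array + p.sq_entries * sizeof(uint32_t);
    size_t cq_len = p.cq_off.cqes + p.cq_entries * sizeof(struct io_uring_cqe);
    size_t sqe_len = p.sq_entries * sizeof(struct io_uring_sqe);
    void *sq = map_ring(fd, sq_len, IORING_OFF_SQ_RING);
    void *cq = map_ring(fd, cq_len, IORING_OFF_CQ_RING);
    void *sqes = map_ring(fd, sqe_len, IORING_OFF_SQES);

    long result = -ENOMEM;
    if (sq != MAP_FAILED && cq != MAP_FAILED && sqes != MAP_FAILED)
        result = run_nop(fd, &p, sq, cq, sqes);

    if (sq != MAP_FAILED) munmap(sq, sq_len);
    if (cq != MAP_FAILED) munmap(cq, cq_len);
    if (sqes != MAP_FAILED) munmap(sqes, sqe_len);
    close(fd);
    return result;
}
```

A NOP does no I/O, so the example needs no file or socket. Raising to_submit after filling several SQEs is what lets one system call cover a whole batch of requests.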
io_uring_register
The “VIP pass” for your data. By pre‑registering buffers or files, the kernel can use them directly without extra look‑ups or mappings, eliminating a lot of overhead.
Ecosystem and Language Bindings
C developers can use the official liburing library, which wraps the three syscalls and provides helper functions.
Rust also has strong support for io_uring, offering memory-safety guarantees that prevent the classic “danger zone” where both the kernel and the application might access the same buffer simultaneously. Crates such as tokio-uring achieve this by taking ownership of the buffer when an operation is submitted and handing it back together with the result, so the compiler rejects any user code that tries to touch the buffer while the kernel still holds it.
Popular Rust crates include:
- tokio-uring – integrates io_uring with the Tokio async runtime.
- glommio – a thread-per-core framework built on top of io_uring.
- Others: io-uring, uring-sys, etc.
There’s More
Completion‑based I/O isn’t unique to Linux:
| OS | Mechanism | Characteristics |
|---|---|---|
| Windows | I/O Completion Ports (IOCP) | Asynchronous but still requires a system call per request, leading to higher syscall overhead than io_uring. |
| macOS | kqueue | Readiness‑based; you must call kevent to discover readiness and then issue separate syscalls for the actual I/O, incurring the same syscall tax io_uring was designed to eliminate. |
io_uring therefore represents a true asynchronous programming model for Linux, minimizing the number of system calls and context switches required for high‑performance I/O.
“Be the best cog, but keep in mind you’re not the only one.” – a reminder that solving every problem isn’t necessary; sometimes the right tool (like io_uring) is enough to make a big difference.
The Real Problem: A Cross‑Platform, Completion‑Based Asynchronous Runtime
In my view, the real problem lies in creating an asynchronous runtime that is both cross-platform and completion-based. Such a technology already exists: compio, a Rust framework for asynchronous I/O operations.
What’s Missing?
- Zero‑cost abstraction – Compio arguably does not provide a true zero‑cost abstraction.
- Fixed buffers – It uses fixed buffers (a design choice of io_uring) which are immutable references.
Most of the Rust ecosystem is built on top of the std::io::Read and std::io::Write traits, which borrow their buffers (&mut [u8] for reads, &[u8] for writes) only for the duration of the call. Compio, on the other hand, emphasizes ownership of buffers rather than borrowing them. This aligns well with the io_uring completion-based model, but it creates a real integration problem with the rest of the ecosystem.
“But again, like I said, we just have to be here, implementing one solution at a time. By believing in ourselves even when it seems impossible. Until next time, peace, focus, desire.”