The Future for Tyr, a Rust GPU Driver for Arm Mali Hardware

Published: (February 12, 2026 at 09:17 AM EST)
Source: Hacker News

The team behind Tyr started 2025 with little to show in our quest to produce a Rust GPU driver for Arm Mali hardware. By the end of the year we were able to play SuperTuxKart (a 3‑D open‑source racing game) at the Linux Plumbers Conference (LPC). Our prototype was a joint effort between Arm, Collabora, and Google; it ran well for the duration of the event, and the performance was more than adequate for players.

Thankfully, we picked up steam at precisely the right moment: Dave Airlie just announced at the Maintainers Summit that the DRM subsystem is only “about a year away” from disallowing new drivers written in C and requiring the use of Rust. Now it is time to lay out a possible roadmap for 2026 in order to upstream all of this work.

What are we trying to accomplish with Tyr?

Miguel Ojeda’s talk at LPC this year summarized where Rust is being used in the Linux kernel, with drivers like the anonymous shared‑memory subsystem for Android (ashmem) quickly being rolled out to millions of users. Given Mali’s extensive market share in the phone market, supporting this segment is a natural aspiration for Tyr, followed by other embedded platforms where Mali is also present.

In parallel, we must not lose track of upstream: the objective is to evolve together with the Nova Rust GPU driver and ensure that the ecosystem will be useful for any new drivers that might come in the future. The prototype was meant to prove that a Rust driver for Arm Mali could come to fruition with acceptable performance, but now we should iterate on the code and refactor it as needed. This will allow us to learn from our mistakes and settle on a design that is appropriate for an upstream driver.

What is there, and what is not

  • A version of the Tyr driver was merged for the 6.18 kernel release, but it is not capable of much, as a few key Rust abstractions are missing.
  • The downstream branch (the parts of Tyr not yet in the mainline kernel) houses our latest prototype; it is working well enough to run desktop environments and games, even if there are still power‑consumption and GPU‑recovery problems that need to be fixed.
  • The prototype will serve the purpose of guiding our upstream efforts and let us experiment with different designs.

A kernel‑mode GPU driver such as Tyr is a small component backing a much larger user‑mode driver that implements a graphics API like Vulkan or OpenGL. The user‑mode driver translates hardware‑independent API calls into GPU‑specific commands that can be used by the rasterisation process. The kernel’s responsibility centers around:

  • sharing hardware resources between applications,
  • enforcing isolation and fairness, and
  • keeping the hardware operational (providing GPU memory, notifying when submitted work finishes, and exposing a way for userspace to describe dependency chains between jobs).

Our talk at LPC 2025 (available as a YouTube video) goes over this in detail.

[Figure: SuperTuxKart running on Tyr at LPC]

Having a working prototype does not mean it’s ready for real‑world usage. A quick walkthrough of what is missing reveals why.

  • Power management – Mali GPUs are usually found on mobile devices where power is at a premium. Conserving energy and managing thermal characteristics is paramount to user experience, yet Tyr currently has no power‑management or frequency‑scaling code. Rust abstractions to support these features are not available at all.
  • GPU recovery – If the GPU hangs, the system must remain functional to the extent possible; otherwise users may lose all of their work. Our prototype lacks any GPU‑recovery code.

These two items are hard requirements for deployability: a driver that drains the battery or crashes the system cannot be shipped.

On top of that, Vulkan must be correctly implementable on top of Tyr, or we may fail to achieve drop‑in compatibility with our Vulkan driver (PanVK). This requires passing the Vulkan Conformance Test Suite (CTS) when using Tyr instead of the C driver. Once that is achieved we will feel confident enough to add support for more GPU models beyond the currently supported Mali‑G610.

Finally, we will turn our attention to benchmarking to ensure that Tyr can match the C driver’s performance while benefiting from Rust’s safety guarantees. We have already demonstrated a complex game running with acceptable performance, so the early results are encouraging, although no systematic comparison has been done yet.

Which Rust abstractions are missing?

Some required Rust infrastructure is still work‑in‑progress:

  • Graphics Execution Manager (GEM) shmem objects – Lyude Paul’s work on GEM shmem objects is needed to allocate memory for systems without discrete video RAM. This is especially relevant for Tyr, as the GPU is packaged in a larger system‑on‑chip and must share system memory.
  • Lock‑free buffer region sharing – There are open questions about how to share non‑overlapping regions of a GPU buffer without locks, preferably encoded in the type system and checked at compile time.
  • GPUVM support – Modern kernel drivers must let the user‑mode driver manage its own view of the GPU address space. In the DRM ecosystem this is delegated to GPUVM, which contains the common code to manage those address spaces on hardware that offers it.
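
The lock‑free region‑sharing question above can be illustrated in ordinary userspace Rust. The following is only a sketch of the general idea, not a kernel abstraction: `GpuBuffer`, `Region`, and `fill_in_parallel` are hypothetical names, and a `Vec<u8>` stands in for a CPU‑mapped GPU buffer. It relies on the standard `split_at_mut` so that the borrow checker, rather than a lock, guarantees the two regions never overlap.

```rust
/// Hypothetical stand-in for a CPU-mapped GPU buffer (not a kernel type).
pub struct GpuBuffer {
    bytes: Vec<u8>,
}

/// Exclusive view of one region of the buffer. While a `Region` is alive,
/// the borrow checker guarantees that no other live `Region` overlaps it.
pub struct Region<'a> {
    offset: usize,
    bytes: &'a mut [u8],
}

impl GpuBuffer {
    pub fn new(len: usize) -> Self {
        Self { bytes: vec![0; len] }
    }

    /// Splits the buffer at `mid` into two disjoint regions.
    /// Returns `None` if `mid` is past the end of the buffer.
    pub fn split(&mut self, mid: usize) -> Option<(Region<'_>, Region<'_>)> {
        if mid > self.bytes.len() {
            return None;
        }
        let (lo, hi) = self.bytes.split_at_mut(mid);
        Some((
            Region { offset: 0, bytes: lo },
            Region { offset: mid, bytes: hi },
        ))
    }

    pub fn as_slice(&self) -> &[u8] {
        &self.bytes
    }
}

impl Region<'_> {
    /// Offset of this region from the start of the buffer.
    pub fn offset(&self) -> usize {
        self.offset
    }

    pub fn fill(&mut self, value: u8) {
        self.bytes.fill(value);
    }
}

/// Writes both halves from separate threads without taking any lock.
/// Returns `false` if `mid` is out of bounds.
pub fn fill_in_parallel(buf: &mut GpuBuffer, mid: usize, a: u8, b: u8) -> bool {
    let Some((mut lo, mut hi)) = buf.split(mid) else {
        return false;
    };
    std::thread::scope(|s| {
        s.spawn(|| lo.fill(a));
        s.spawn(|| hi.fill(b));
    });
    true
}
```

A fixed two‑way split is the simplest case; the open question in the kernel is how to express arbitrary, dynamically chosen regions with the same compile‑time guarantee.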

These pieces, together with the power‑management and recovery mechanisms mentioned above, form the core of the roadmap for bringing Tyr to a production‑ready, upstream‑ready state in 2026.

## Memory Isolation & GPUVM

- Modern CPUs provide robust memory‑isolation capabilities, and the GPU firmware expects similar control over the placement of certain sections in memory.  
- **Alice Ryhl** is working on:
  - Rust abstractions for **GPUVM**.
  - `io-pgtable` abstractions needed to manipulate the **IOMMU** page tables that enforce memory isolation.  

These efforts build on the earlier work of **Asahi Lina**, who pioneered the first Rust abstractions for the DRM subsystem.
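
To make "managing a view of the GPU address space" more concrete, here is a small userspace model. It is not the GPUVM API and does not touch any page tables: `AddressSpace`, `MapError`, and the object IDs are hypothetical, and the real code also has to deal with splitting and merging mappings, which this sketch leaves out. It only shows the bookkeeping idea of non‑overlapping virtual ranges backed by buffer objects.

```rust
use std::collections::BTreeMap;

#[derive(Debug, PartialEq, Eq)]
pub enum MapError {
    Empty,
    Overflow,
    Overlap,
}

/// Hypothetical model of one GPU virtual address space.
#[derive(Default)]
pub struct AddressSpace {
    /// Start address -> (length in bytes, backing object id).
    mappings: BTreeMap<u64, (u64, u32)>,
}

impl AddressSpace {
    /// Maps `[va, va + len)` to `object`, rejecting overlapping ranges.
    pub fn map(&mut self, va: u64, len: u64, object: u32) -> Result<(), MapError> {
        if len == 0 {
            return Err(MapError::Empty);
        }
        let end = va.checked_add(len).ok_or(MapError::Overflow)?;
        // Existing mappings never overlap each other, so only the last one
        // starting below `end` can intersect the new range.
        if let Some((&start, &(l, _))) = self.mappings.range(..end).next_back() {
            if start + l > va {
                return Err(MapError::Overlap);
            }
        }
        self.mappings.insert(va, (len, object));
        Ok(())
    }

    /// Removes the mapping that starts exactly at `va`.
    pub fn unmap(&mut self, va: u64) -> bool {
        self.mappings.remove(&va).is_some()
    }

    /// Resolves an address to `(object id, offset into the object)`.
    pub fn lookup(&self, addr: u64) -> Option<(u32, u64)> {
        let (&start, &(len, object)) = self.mappings.range(..=addr).next_back()?;
        (addr - start < len).then_some((object, addr - start))
    }
}
```

In a real driver, each successful `map` or `unmap` would additionally be mirrored into the hardware page tables, which is the part the `io-pgtable` abstractions are meant to cover.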

---

## Unresolved DRM Device‑Initialization Issue

The current code requires an initializer for the driver’s private data to return a
[`drm::Device`](https://rust.docs.kernel.org/kernel/drm/device/struct.Device.html) instance.  
However, some drivers need the `drm::Device` to construct that private data, creating a circular dependency that cannot be satisfied.

- **Tyr** is affected because:
  - Allocating GPU memory through the **GEM shmem** API needs a `drm::Device`.
  - Some fields in Tyr’s private data must store GEM objects (e.g., for parsing and booting firmware).

**Lyude Paul** is addressing this by introducing a `drm::DeviceCtx` that encodes the device state in the type system.
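
The general technique of encoding device state in the type system is often called the typestate pattern. The sketch below shows it in plain Rust under loud assumptions: it is not the `drm::DeviceCtx` design, and `Device`, `Unregistered`, `Registered`, `GemObject`, and `PrivateData` are hypothetical names chosen for illustration. The point is that allocation is available before registration, so the private data can be built first and attached afterwards, which breaks the cycle.

```rust
/// Hypothetical GEM object: only its size is modelled.
pub struct GemObject {
    pub size: usize,
}

/// Hypothetical driver-private data that must own a GEM object.
pub struct PrivateData {
    pub firmware: GemObject,
}

/// State marker: the device exists but has no private data yet.
pub struct Unregistered;

/// State marker: the device is registered and owns its private data.
pub struct Registered {
    data: PrivateData,
}

/// A device whose lifecycle stage is part of its type.
pub struct Device<S> {
    name: &'static str,
    state: S,
}

impl<S> Device<S> {
    /// Allocation works in every state, including before registration.
    pub fn alloc_gem(&self, size: usize) -> GemObject {
        GemObject { size }
    }

    pub fn name(&self) -> &'static str {
        self.name
    }
}

impl Device<Unregistered> {
    pub fn new(name: &'static str) -> Self {
        Self { name, state: Unregistered }
    }

    /// Consumes the unregistered device, so the old handle cannot be reused.
    pub fn register(self, data: PrivateData) -> Device<Registered> {
        Device { name: self.name, state: Registered { data } }
    }
}

impl Device<Registered> {
    /// Only callable after registration; the compiler rejects earlier use.
    pub fn firmware_size(&self) -> usize {
        self.state.data.firmware.size
    }
}
```

With these types, `Device::new("tyr")` can allocate the firmware buffer, and only then is `register` called with a `PrivateData` built from it. Calling `firmware_size` on an unregistered device is a compile‑time error rather than a runtime check.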

> **Status:** Most of the roadmap remains blocked on **GEM shmem**, **GPUVM**, **io‑pgtable**, and the device‑initialization issue.

There is also an opportunity to integrate work from the **Nova** team:
- The [`register!`](https://lwn.net/ml/all/20260126-register-v3-0-2328a59d7312@nvidia.com/) macro.
- The [`bounded`](https://lwn.net/ml/rust-for-linux/20251108-bounded_ints-v4-0-c9342ac7ebd1@nvidia.com/) integer types.

Once these pieces are in place, we expect to boot the GPU firmware quickly and proceed without further roadblocks until job‑submission discussions begin.
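
For readers unfamiliar with the idea behind bounded integers, the following is a minimal sketch using const generics. It is not the Nova team's `bounded` implementation or API; `Bounded`, `try_new`, `pack_ctrl`, and the register layout are all made up for illustration. It shows why such a type is useful when packing register fields: a value that has been checked once can never spill into neighbouring bits.

```rust
/// An unsigned value guaranteed to fit in `BITS` bits (illustrative only).
#[derive(Clone, Copy, Debug, PartialEq, Eq)]
pub struct Bounded<const BITS: u32>(u32);

impl<const BITS: u32> Bounded<BITS> {
    /// Largest value representable in `BITS` bits.
    pub const MAX: u32 = if BITS >= 32 { u32::MAX } else { (1u32 << BITS) - 1 };

    /// Returns `None` if `value` needs more than `BITS` bits.
    pub fn try_new(value: u32) -> Option<Self> {
        if value <= Self::MAX {
            Some(Self(value))
        } else {
            None
        }
    }

    pub fn get(self) -> u32 {
        self.0
    }
}

/// Packs a hypothetical control register: bits 0..=3 hold `mode`,
/// bits 4..=11 hold `priority`. No masking is needed because the
/// argument types already prove that each field fits.
pub fn pack_ctrl(mode: Bounded<4>, priority: Bounded<8>) -> u32 {
    mode.get() | (priority.get() << 4)
}
```

The range check happens once, at construction; every later use of the value, such as `pack_ctrl`, can rely on the type instead of re‑validating.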

---

## Fence Handling & Synchronization

Fences are synchronization primitives that GPU drivers signal once jobs finish executing. Proper handling is critical:

- Paths that complete **fences** must be carefully annotated; otherwise, the system may deadlock.
- Only safe locks may be taken in the signaling path.
- **DMA fences must always signal in finite time**; otherwise, other parts of the system could block forever.
- Memory allocation in these paths must use `GFP_ATOMIC`. Allocating with any other flag may allow the shrinker to run under memory pressure, potentially waiting on the very job that triggered it.

All of this is covered in the kernel documentation:
- [DMA‑Fence Cross‑Driver Contract](https://docs.kernel.org/driver-api/dma-buf.html#dma-fence-cross-driver-contract).

> **Current prototype:** Ignores these constraints, so it can randomly deadlock under memory pressure.  
> **Fix:** Systematically vet the critical sections of the driver. Doing this elegantly—potentially leveraging Rust’s type system—remains an open discussion.
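
One direction sometimes suggested for this kind of problem is a capability token: code on the signalling path receives a value that only offers operations that are safe there. The sketch below is a userspace illustration of that idea and nothing more. `SignallingCtx`, `Fence`, `Gfp`, and `complete_job` are hypothetical, `Gfp` merely mimics the kernel flag names, and a token like this cannot stop a function from calling unrelated blocking code, so it is not a complete answer to the open question above.

```rust
/// Allocation modes, loosely modelled on the kernel's GFP flags.
#[derive(Clone, Copy, Debug, PartialEq, Eq)]
pub enum Gfp {
    Kernel,
    Atomic,
}

/// Hypothetical fence that only records whether it was signalled.
#[derive(Default)]
pub struct Fence {
    signalled: bool,
}

impl Fence {
    pub fn is_signalled(&self) -> bool {
        self.signalled
    }
}

/// Token held by code running in the fence-signalling critical section.
pub struct SignallingCtx {
    _private: (),
}

impl SignallingCtx {
    /// Marks the start of the critical section.
    pub fn enter() -> Self {
        Self { _private: () }
    }

    /// The only allocator reachable through the token. It always reports
    /// the atomic mode, modelling an allocation that never waits on reclaim.
    pub fn alloc(&self, len: usize) -> (Vec<u8>, Gfp) {
        (vec![0; len], Gfp::Atomic)
    }

    /// Signals the fence and consumes the token, ending the section.
    pub fn signal(self, fence: &mut Fence) {
        fence.signalled = true;
    }
}

/// A completion handler. Taking the token by value documents in the
/// signature that this function runs on the signalling path.
pub fn complete_job(ctx: SignallingCtx, fence: &mut Fence) -> Gfp {
    let (_scratch, mode) = ctx.alloc(64);
    ctx.signal(fence);
    mode
}
```

Because `signal` consumes the token, nothing can be allocated through it once the fence has been signalled, and a handler that forgets to signal still holds a value whose type makes the omission easier to spot in review.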

Looking into the Future

Rethinking Job‑Submission Logic

The existing design assumes the use of drm_gpu_scheduler, but this has become a hindrance:

  • Some drivers now rely on GPU firmware to schedule jobs.
  • drm_gpu_scheduler suffers from hard‑to‑solve lifetime problems.

At the X.Org Developer’s Conference 2025, we discussed alternatives. The emerging consensus for Rust is to create a new component that:

  1. Ensures job dependencies are satisfied before a job becomes eligible for placement in the GPU’s ring buffer.
  2. Hands off scheduling to the firmware once the job is ready.

Because this component does not schedule jobs itself, it will likely be called JobQueue. This name conveys a queue where work is deposited and removed once dependencies are met.

  • Philipp Stanner is spearheading the JobQueue effort.
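
The two responsibilities listed above can be modelled in a few lines of userspace Rust. This is a sketch of the concept only, not the JobQueue design under discussion: `Job`, `FenceId`, and the method names are hypothetical, a `Vec` stands in for the firmware ring buffer, and strict FIFO hand‑off within one queue is an assumption made here for simplicity.

```rust
use std::collections::{HashSet, VecDeque};

pub type FenceId = u32;

/// A unit of GPU work and the fences it must wait for.
pub struct Job {
    pub id: u32,
    pub deps: Vec<FenceId>,
}

/// Holds jobs until their dependencies are met, then hands them off.
#[derive(Default)]
pub struct JobQueue {
    pending: VecDeque<Job>,
    signalled: HashSet<FenceId>,
    /// Stand-in for the firmware ring buffer: job ids in hand-off order.
    ring: Vec<u32>,
}

impl JobQueue {
    pub fn new() -> Self {
        Self::default()
    }

    /// Deposits a job; it is handed off immediately if nothing blocks it.
    pub fn submit(&mut self, job: Job) {
        self.pending.push_back(job);
        self.flush();
    }

    /// Records that a fence has signalled and releases unblocked jobs.
    pub fn signal(&mut self, fence: FenceId) {
        self.signalled.insert(fence);
        self.flush();
    }

    pub fn ring(&self) -> &[u32] {
        &self.ring
    }

    pub fn pending_len(&self) -> usize {
        self.pending.len()
    }

    /// Moves jobs from the front of the queue to the ring while the front
    /// job has all dependencies satisfied. No scheduling decision is made
    /// here; ordering on the hardware is left to the firmware.
    fn flush(&mut self) {
        while let Some(job) = self.pending.front() {
            if !job.deps.iter().all(|d| self.signalled.contains(d)) {
                break;
            }
            let job = self.pending.pop_front().expect("front() returned Some");
            self.ring.push(job.id);
        }
    }
}
```

Note that in this model a job with no dependencies still waits behind an earlier blocked job, which follows from the FIFO assumption rather than from anything stated about the real design.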

C‑Driver Interoperability

The plan includes exposing an API for C drivers using a technique described in a previous article. This could become the first Rust kernel component usable from C drivers, marking a milestone for Rust in the kernel and demonstrating seamless C–Rust interoperability.
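
As a rough picture of what "usable from C" means at the language level, the sketch below shows the generic opaque‑handle FFI pattern in standalone Rust. It is not the technique from the earlier article, which may differ substantially; the `demo_queue_*` names are invented, and the symbol‑export attributes that a real C caller would need to link by name are deliberately left out.

```rust
use std::ffi::c_void;

/// Rust-side state. C code only ever sees an opaque pointer to it.
struct Queue {
    jobs: Vec<u64>,
}

/// Creates a queue. The caller owns the handle until `demo_queue_destroy`.
pub extern "C" fn demo_queue_create() -> *mut c_void {
    Box::into_raw(Box::new(Queue { jobs: Vec::new() })).cast()
}

/// Appends a job. Returns 0 on success, -1 if `handle` is null.
///
/// # Safety
/// `handle` must be null or a live handle from `demo_queue_create`.
pub unsafe extern "C" fn demo_queue_push(handle: *mut c_void, job: u64) -> i32 {
    // SAFETY: per the contract above, a non-null handle points to a `Queue`.
    match unsafe { handle.cast::<Queue>().as_mut() } {
        Some(queue) => {
            queue.jobs.push(job);
            0
        }
        None => -1,
    }
}

/// Returns the number of queued jobs, or 0 for a null handle.
///
/// # Safety
/// `handle` must be null or a live handle from `demo_queue_create`.
pub unsafe extern "C" fn demo_queue_len(handle: *const c_void) -> usize {
    // SAFETY: per the contract above, a non-null handle points to a `Queue`.
    match unsafe { handle.cast::<Queue>().as_ref() } {
        Some(queue) => queue.jobs.len(),
        None => 0,
    }
}

/// Frees the queue. Passing null is a no-op.
///
/// # Safety
/// `handle` must be null or a live handle that is not used afterwards.
pub unsafe extern "C" fn demo_queue_destroy(handle: *mut c_void) {
    if !handle.is_null() {
        // SAFETY: the handle came from `Box::into_raw` in `demo_queue_create`.
        drop(unsafe { Box::from_raw(handle.cast::<Queue>()) });
    }
}
```

The C side never learns the layout of `Queue`, so the Rust implementation stays free to change. The cost is that lifetime and aliasing rules become documentation (`# Safety`) instead of compiler‑checked facts, which is exactly the boundary a shared component has to design carefully.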

Tyr as a Testbed

Tyr can serve as a testbed for the new design:

  • If we can replace the old drm_gpu_scheduler with the JobQueue in the prototype, it will validate the approach for more complex drivers like Nova.
  • Expect continued discussion on this topic.

Summary

  • Progress: Significant advances in Rust abstractions for GPUVM, IOMMU page tables, and device‑state encoding.
  • Blockers: GEM shmem, GPUVM, io‑pgtable, and the DRM device‑initialization cycle.
  • Next Steps: Integrate Nova macros, resolve fence handling, develop JobQueue, and expose a C‑driver API.
  • Outlook: Tyr has made substantial progress this past year and is poised to continue advancing through 2026 and beyond.
