GPU Compute Shaders in Pure Go: gogpu/gg v0.15.0

Published: (December 25, 2025 at 09:13 PM EST)
Source: Dev.to


Two days ago we shipped gogpu/gg v0.14.0 with alpha masks and a fluent PathBuilder. Yes—two days, we’re moving fast.

Looking at performance profiles, we saw a problem:

The CPU was the bottleneck.

Our GPU could render millions of pixels in milliseconds, but the CPU spent a lot of time tessellating paths. This is the classic 2‑D graphics problem: CPU tessellation doesn’t scale.

So we moved the entire rasterisation pipeline to GPU compute shaders.

Today, gogpu/gg v0.15.0 is here: 2 280 lines of WGSL compute shaders, a vello‑style pipeline (flatten, coarse tile binning, fine per‑pixel coverage), and expected speed‑ups for complex scenes. All in Pure Go.

The Performance Challenge

// Drawing 10 000 circles
ctx := gg.NewContext(800, 600)
for i := 0; i < 10000; i++ {
    // […] (rest of this example is missing from the source)

[…]

[…] {
    let curve = curves[id.x];

    // Adaptive subdivision based on curvature
    let segments = subdivide_bezier(curve);

    // Write to global buffer (thousands in parallel!)
    for (var i = 0u; i […]

[…]

[…] {
    let segment = segments[id.x];
    let bounds   = segment_bounds(segment);

    // Find overlapping tiles
    for (var y = tile_min.y; y […]

[…]

[…] {
    let tile_id = (pixel.y / TILE_SIZE) * tile_width + (pixel.x / TILE_SIZE);

    var coverage = 0.0;
    for (var i = 0u; i […]
    […](color.rgb, saturate(coverage));
}

Perfect anti‑aliasing at any scale—no jaggies, no MSAA overhead.
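
To make the first pipeline stage more concrete, here is a minimal CPU-side sketch in Go of curvature-dependent flattening for a quadratic Bézier. It is not the code from flatten.wgsl: the names Point, Quad, SegmentCount and Flatten are hypothetical, and the technique shown is plain uniform-parameter subdivision where only the segment count depends on curvature. The count comes from the standard bound that a quadratic split into n equal parameter steps deviates from its chords by at most |P0 - 2·P1 + P2| / (4·n²).

```go
package main

import (
	"fmt"
	"math"
)

// Point and Quad are hypothetical types for this sketch; they are not gogpu/gg API.
type Point struct{ X, Y float64 }

// Quad is a quadratic Bézier curve with control points P0, P1, P2.
type Quad struct{ P0, P1, P2 Point }

// Eval returns the point on the curve at parameter t in [0, 1].
func (q Quad) Eval(t float64) Point {
	mt := 1 - t
	return Point{
		X: mt*mt*q.P0.X + 2*mt*t*q.P1.X + t*t*q.P2.X,
		Y: mt*mt*q.P0.Y + 2*mt*t*q.P1.Y + t*t*q.P2.Y,
	}
}

// SegmentCount returns how many line segments keep the flattening error
// at or below tol (tol must be > 0). Flat curves need one segment,
// strongly bent curves need more.
func SegmentCount(q Quad, tol float64) int {
	dx := q.P0.X - 2*q.P1.X + q.P2.X
	dy := q.P0.Y - 2*q.P1.Y + q.P2.Y
	d := math.Hypot(dx, dy) // magnitude of the second difference
	n := int(math.Ceil(math.Sqrt(d / (4 * tol))))
	if n < 1 {
		n = 1
	}
	return n
}

// Flatten returns the n+1 polyline vertices approximating the curve.
func Flatten(q Quad, tol float64) []Point {
	n := SegmentCount(q, tol)
	pts := make([]Point, 0, n+1)
	for i := 0; i <= n; i++ {
		pts = append(pts, q.Eval(float64(i)/float64(n)))
	}
	return pts
}

func main() {
	flat := Quad{Point{0, 0}, Point{50, 0}, Point{100, 0}}
	bent := Quad{Point{0, 0}, Point{50, 100}, Point{100, 0}}
	fmt.Println(SegmentCount(flat, 0.25), SegmentCount(bent, 0.25)) // prints "1 15"
}
```

Each curve is processed independently of the others, which is the property that lets a compute shader assign one invocation per curve.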

Expected Performance Gains

| Workload | Expected Behaviour |
|----------|---------------------|
| Simple paths ([…] | […] |

[…]

if h.segmentCount >= h.segmentThreshold {
    // GPU path: dispatch compute shaders
    h.gpu.Rasterize(coarse, segments, backdrop, scene.FillNonZero)
} else {
    // CPU path: software rasterization
    h.cpu.RasterizeSegments(segments, backdrop)
}

Why? Small paths ([…]

[…]

  • Only `atomic<u32>` and `atomic<i32>` are supported.
  • Specific memory orders are required.
  • Buffer layout matters.

// This works
@group(0) @binding(1) var<storage, read_write> counts: array<atomic<u32>>;
atomicAdd(&counts[i], 1u);

// This doesn't
var<storage, read_write> counts: array<u32>; // Not atomic!

Debugging was tricky – WGSL validation errors are… cryptic.
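
As a rough CPU analogy for tile binning with atomic counters, the Go sketch below lets one goroutine per segment play the role of one GPU invocation, with `atomic.AddUint32` standing in for WGSL's `atomicAdd`. Everything here is an assumption made for illustration: the `Segment` type, the `tileSize` value, and the `tileRange` and `BinSegments` helpers are hypothetical and do not come from coarse.wgsl or the gogpu/gg API. Segments are binned conservatively by bounding box.

```go
package main

import (
	"fmt"
	"math"
	"sync"
	"sync/atomic"
)

// tileSize is a hypothetical tile edge length in pixels.
const tileSize = 16

// Segment is a hypothetical line segment in pixel coordinates.
type Segment struct{ X0, Y0, X1, Y1 float64 }

// tileRange returns the inclusive range of tile indices touched by the
// interval [a, b], clamped to [0, n-1]. The range is empty (lo > hi)
// when the interval lies entirely outside the grid.
func tileRange(a, b float64, n int) (lo, hi int) {
	if a > b {
		a, b = b, a
	}
	lo = int(math.Floor(a / tileSize))
	hi = int(math.Floor(b / tileSize))
	if lo < 0 {
		lo = 0
	}
	if hi > n-1 {
		hi = n - 1
	}
	return lo, hi
}

// BinSegments counts, per tile, how many segment bounding boxes overlap it.
// Many goroutines may hit the same counter at once, so a plain counts[i]++
// would be a data race; the atomic add makes each increment safe.
func BinSegments(segs []Segment, tilesX, tilesY int) []uint32 {
	counts := make([]uint32, tilesX*tilesY)
	var wg sync.WaitGroup
	for _, s := range segs {
		wg.Add(1)
		go func(s Segment) {
			defer wg.Done()
			x0, x1 := tileRange(s.X0, s.X1, tilesX)
			y0, y1 := tileRange(s.Y0, s.Y1, tilesY)
			for ty := y0; ty <= y1; ty++ {
				for tx := x0; tx <= x1; tx++ {
					atomic.AddUint32(&counts[ty*tilesX+tx], 1)
				}
			}
		}(s)
	}
	wg.Wait()
	return counts
}

func main() {
	// 1 000 identical segments whose bounding box covers tiles (0..1, 0..1).
	segs := make([]Segment, 1000)
	for i := range segs {
		segs[i] = Segment{0, 0, 20, 20}
	}
	counts := BinSegments(segs, 4, 4)
	fmt.Println(counts[0], counts[1], counts[4], counts[5], counts[2]) // prints "1000 1000 1000 1000 0"
}
```

The contrast mirrors the two WGSL declarations above: with a non-atomic counter array, concurrent increments can be lost, so the per-tile totals would be unreliable.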

What We Shipped

Statistics

  • 2 280 LOC WGSL shaders (8 shader files)
  • ~20 K LOC Go in backend/wgpu/
  • 74 % test coverage overall
  • 0 linter issues

Shader Files

backend/wgpu/shaders/
├── flatten.wgsl     # 589 LOC — Bezier curve flattening
├── coarse.wgsl      # 335 LOC — Tile binning with atomics
├── fine.wgsl        # 290 LOC — Per‑pixel coverage
├── blend.wgsl       # 424 LOC — 29 blend modes on GPU
├── composite.wgsl   # 235 LOC — Layer compositing
├── strip.wgsl       # 155 LOC — Sparse strip rendering
├── blit.wgsl        #  43 LOC — Final output blit
└── msdf_text.wgsl   # 209 LOC — MSDF text rendering

Go Implementation

backend/wgpu/
├── gpu_flatten.go       # 809 LOC — Flatten pipeline
├── gpu_coarse.go        # 698 LOC — Coarse rasterization
├── gpu_fine.go          # 752 LOC — Fine rasterization
├── sparse_strips_gpu.go # 837 LOC — Hybrid CPU/GPU selection
├── renderer.go          # 822 LOC — Main renderer
├── pipeline.go          # 369 LOC — Pipeline orchestration
├── memory.go            # 413 LOC — GPU memory management
└── ... (40+ files total)

Try It Yourself

Installation

go get github.com/gogpu/gg@v0.15.0

Quick Example

package main

import "github.com/gogpu/gg"

func main() {
    ctx := gg.NewContext(512, 512)
    ctx.ClearWithColor(gg.White)

    // 1 000 circles — GPU backend handles complex scenes efficiently
    ctx.SetColor(gg.Hex("#e74c3c"))
    for i := 0; i < 1000; i++ {
        // […] (loop body is missing from the source)
    }
    // […]
}

From CPU bottleneck to GPU parallelism. From sequential tessellation to massively parallel compute shaders.

This is what Pure Go can do.

go get github.com/gogpu/gg@v0.15.0

⭐ Star the repo if you find it useful!

Part of the GoGPU Journey series

GPU Compute Shaders in Pure Go ← You are here
