I Vibe-Coded a GPU Accelerated Face Cropping Tool in Rust — Here’s Why
Source: Dev.to
The Problem
- Online services that upload your images to someone else’s server (a non‑starter for student data).
- Desktop tools that choke on batch jobs or produce inconsistent results.
I needed something that could handle hundreds of images locally, produce deterministic results, and do it fast.
Why Rust?
Ecosystem – Rust’s crate ecosystem now rivals Python’s in richness. Need GPU compute? There’s wgpu. Face detection inference? Build it on the ndarray and image crates. GUI? egui. Batch data ingestion from CSV, Excel, Parquet, SQLite? All covered by mature, well‑maintained crates. I rarely hit a wall where I needed to write bindings or roll my own solution; the building blocks were already there.
Vibe coding – When you’re building with an LLM as your copilot, the compiler’s error messages become a superpower. Rust tells you exactly what went wrong, where, and often how to fix it. Feeding a Rust compiler error to an LLM can resolve it in one shot, which is far harder with segfaults in C or runtime errors in dynamically typed languages. This tight feedback loop made me dramatically more productive than I have been in any other systems language.
The Architecture
- Face detection: YuNet, a lightweight neural network fast enough for real‑time use. I implemented the inference pipeline from scratch with custom WGSL compute shaders, avoiding a heavy ONNX Runtime dependency and keeping full control over the GPU pipeline.
- Compute shaders: Seven custom shaders handle everything from image pre‑processing to face detection inference to post‑processing enhancements. The pipeline stays on the GPU when possible, minimizing expensive CPU↔GPU data transfers.
- Enhancement pipeline: Auto colour correction, exposure, brightness, contrast, saturation, sharpening, skin smoothing, red‑eye removal, and portrait background blur. Each operation has both GPU and CPU paths with automatic fallback.
- Batch processing with data mapping: Import CSV, Excel, Parquet, or SQLite files to drive batch naming. Feed a spreadsheet of student names and photo filenames, and the tool handles the rest.
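To make the data-mapping idea concrete, here is a minimal sketch of turning a spreadsheet of names and photo filenames into a rename plan. It is an illustration only: the column names `name` and `photo`, the helpers `slugify` and `build_rename_map`, and the hand-rolled parsing (standard library only, no quoted-field support) are all assumptions, not the tool's actual implementation, which would more plausibly lean on dedicated CSV, Excel, Parquet, and SQLite crates.

```rust
use std::collections::BTreeMap;

/// Turn a display name into a filesystem-friendly stem,
/// e.g. "Ada Lovelace" -> "ada_lovelace". (Hypothetical helper.)
pub fn slugify(name: &str) -> String {
    let mut out = String::new();
    for ch in name.trim().chars() {
        if ch.is_alphanumeric() {
            out.extend(ch.to_lowercase());
        } else if !out.is_empty() && !out.ends_with('_') {
            // Collapse any run of separators or punctuation into one underscore.
            out.push('_');
        }
    }
    out.trim_end_matches('_').to_string()
}

/// Parse a simple CSV with `name` and `photo` columns (assumed schema) into a
/// map from source photo filename to output filename.
/// Simplification: fields are split on commas, so quoted fields are not supported.
pub fn build_rename_map(csv: &str, ext: &str) -> Result<BTreeMap<String, String>, String> {
    let mut lines = csv.lines().map(str::trim).filter(|l| !l.is_empty());
    let header = lines.next().ok_or("empty input")?;
    let cols: Vec<&str> = header.split(',').map(str::trim).collect();
    let name_idx = cols.iter().position(|c| *c == "name").ok_or("missing `name` column")?;
    let photo_idx = cols.iter().position(|c| *c == "photo").ok_or("missing `photo` column")?;

    let mut map = BTreeMap::new();
    for (i, line) in lines.enumerate() {
        let fields: Vec<&str> = line.split(',').map(str::trim).collect();
        let (Some(name), Some(photo)) = (fields.get(name_idx), fields.get(photo_idx)) else {
            return Err(format!("data row {}: too few fields", i + 1));
        };
        map.insert(photo.to_string(), format!("{}.{}", slugify(name), ext));
    }
    Ok(map)
}
```

Calling `build_rename_map(csv_text, "png")` on a file with the rows `Ada Lovelace,IMG_001.jpg` and `Alan Turing,IMG_002.jpg` yields a plan that maps `IMG_001.jpg` to `ada_lovelace.png` and `IMG_002.jpg` to `alan_turing.png`. The sketch does not handle two people whose names produce the same slug; a real batch tool would need a collision policy.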
The Hard Parts
- VRAM Management
- Multi‑Face Detection
- Cross‑Platform GPU Support
What I Shipped
- Six crop presets (LinkedIn, Passport, Instagram, ID Card, Avatar, Headshot), plus fully custom dimensions.
- Quality scoring: Laplacian‑variance sharpness analysis categorises each crop as Low, Medium, or High quality.
- Native GUI built with egui – live preview, undo/redo, and processing history.
- CLI mode for scripting and automation.
- 4 export formats with configurable quality settings.
- MIT licensed and fully open source.
- Codebase is ~97 % Rust.
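The Laplacian‑variance quality score mentioned above can be sketched in a few lines. This is a CPU-only illustration under stated assumptions: the 4-neighbour Laplacian kernel, the function names, and especially the two thresholds are hypothetical and not taken from the tool, which may use a different kernel, run on the GPU, or calibrate its buckets differently.

```rust
/// Quality buckets named in the article.
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
pub enum Quality {
    Low,
    Medium,
    High,
}

/// Variance of the 4-neighbour Laplacian over the interior pixels of a
/// row-major 8-bit grayscale image. Blurry images have weak edges and
/// therefore a low variance. Returns 0.0 for images smaller than 3x3.
pub fn laplacian_variance(gray: &[u8], width: usize, height: usize) -> f64 {
    assert_eq!(gray.len(), width * height, "buffer size must match dimensions");
    if width < 3 || height < 3 {
        return 0.0;
    }
    let px = |x: usize, y: usize| f64::from(gray[y * width + x]);
    let n = ((width - 2) * (height - 2)) as f64;
    let (mut sum, mut sum_sq) = (0.0_f64, 0.0_f64);
    for y in 1..height - 1 {
        for x in 1..width - 1 {
            // Kernel: [0 1 0; 1 -4 1; 0 1 0]
            let lap = px(x - 1, y) + px(x + 1, y) + px(x, y - 1) + px(x, y + 1)
                - 4.0 * px(x, y);
            sum += lap;
            sum_sq += lap * lap;
        }
    }
    let mean = sum / n;
    // Var = E[x^2] - E[x]^2, clamped so rounding can never yield a negative value.
    (sum_sq / n - mean * mean).max(0.0)
}

/// Map a variance to a bucket. Both thresholds are illustrative assumptions.
pub fn classify(variance: f64) -> Quality {
    const LOW_MAX: f64 = 100.0; // hypothetical cut-off
    const MEDIUM_MAX: f64 = 500.0; // hypothetical cut-off
    if variance < LOW_MAX {
        Quality::Low
    } else if variance < MEDIUM_MAX {
        Quality::Medium
    } else {
        Quality::High
    }
}
```

A perfectly flat image scores 0.0 and lands in `Low`, while a high-contrast checkerboard scores far above any plausible threshold. In practice the cut-offs would have to be tuned against real crops at the output resolution, since the variance depends on image size and content.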
What I Learned
- Write the GPU path first – Designing around CPU processing and bolting on GPU later leads to awkward data flow and unnecessary copies. Start with GPU in mind and add CPU fallback where needed.
- Batch processing exposes every edge case – A tool that works on 10 images will find new ways to fail on 1,000. Memory leaks invisible in single‑image mode become showstoppers at scale.
- Deterministic output matters – When processing official documents like ID photos, slightly different crops from the same input are unacceptable. Achieving floating‑point reproducibility across GPU and CPU paths required real effort.
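One possible way to limit GPU/CPU drift in the final crop, sketched here as an assumption rather than a description of how the tool does it, is to round the detector's floating-point box to whole pixels once and then derive the crop rectangle with integer arithmetic only. The `Rect` type, `quantize`, `crop_rect`, and the padding parameter are all hypothetical names introduced for this illustration.

```rust
/// Axis-aligned rectangle in whole pixels. (Hypothetical type.)
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
pub struct Rect {
    pub x: u32,
    pub y: u32,
    pub w: u32,
    pub h: u32,
}

/// Round a floating-point detection box to whole pixels exactly once, so
/// small numeric differences between compute paths collapse to one value.
pub fn quantize(x: f32, y: f32, w: f32, h: f32) -> Rect {
    // Negative and NaN inputs become 0; `as` saturates on overflow.
    let q = |v: f32| v.max(0.0).round() as u32;
    Rect { x: q(x), y: q(y), w: q(w), h: q(h) }
}

/// Integer-only crop: scale the face height by `pad_percent`, apply the
/// target aspect ratio (width, height), centre on the face, clamp to the image.
pub fn crop_rect(face: Rect, img_w: u32, img_h: u32, aspect: (u32, u32), pad_percent: u32) -> Rect {
    let (aw, ah) = (u64::from(aspect.0), u64::from(aspect.1));
    assert!(aw > 0 && ah > 0, "aspect ratio terms must be non-zero");
    let (iw, ih) = (u64::from(img_w), u64::from(img_h));

    let mut h = u64::from(face.h) * (100 + u64::from(pad_percent)) / 100;
    let mut w = h * aw / ah;
    // Shrink to fit the image while keeping the aspect ratio.
    if w > iw {
        w = iw;
        h = w * ah / aw;
    }
    if h > ih {
        h = ih;
        w = h * aw / ah;
    }

    // Work with the doubled centre so no half-pixel values ever appear.
    let cx2 = 2 * u64::from(face.x) + u64::from(face.w);
    let cy2 = 2 * u64::from(face.y) + u64::from(face.h);
    let x = (cx2.saturating_sub(w) / 2).min(iw - w);
    let y = (cy2.saturating_sub(h) / 2).min(ih - h);

    Rect { x: x as u32, y: y as u32, w: w as u32, h: h as u32 }
}
```

With this approach, two detections that differ only by a fraction of a pixel, such as `(100.2, 50.4, 200.1, 199.8)` and `(99.9, 49.6, 199.7, 200.3)`, quantize to the same `Rect` and therefore produce the same crop. It is a mitigation, not a guarantee: values that straddle a rounding boundary can still diverge, so the upstream paths need to agree closely in the first place.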
Try It Yourself
- GitHub:
- Website: