How I Built a Real-Time Hand Tracking Platform That Runs Entirely in the Browser

Published: January 13, 2026 at 01:50 AM EST
2 min read
Source: Dev.to

The Problem with Traditional Hand Tracking

Getting hand tracking in front of users has traditionally meant:

  • Installing Python, OpenCV, and a dozen other dependencies
  • Asking users to trust a native‑app download
  • Wrestling with webcam permissions across different OSes
  • Spinning up servers just to process hand movements
  • Watching latency kill the “real‑time” experience

A Browser‑Only Solution

What if hand tracking just worked in a browser tab?

// Detect pinch gesture: distance between thumb and index fingertips
// (MediaPipe landmark coordinates are normalized to 0–1)
const distance = Math.sqrt(
  Math.pow(thumbTip.x - indexTip.x, 2) +
  Math.pow(thumbTip.y - indexTip.y, 2)
);
// Map the distance to a 0–90° animation angle, clamped at 90
const angle = Math.min(distance * 500, 90);

That’s real code mapping a pinch gesture to control animations. MediaPipe handles detection, Three.js renders the visuals.
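Wrapped as a small helper, the same mapping becomes easy to unit-test. This is a sketch: the 500 scale factor and the 90° cap come from the snippet above, while the helper name `pinchAngle` is mine.

```javascript
// Map the thumb–index fingertip distance (MediaPipe landmarks are
// normalized to 0–1) to an animation angle between 0° and 90°.
function pinchAngle(thumbTip, indexTip) {
  const distance = Math.hypot(
    thumbTip.x - indexTip.x,
    thumbTip.y - indexTip.y
  );
  // Same scale-and-clamp mapping as the snippet above.
  return Math.min(distance * 500, 90);
}

// Fingertips touching → angle 0; fingers spread → clamped at 90.
console.log(pinchAngle({ x: 0.5, y: 0.5 }, { x: 0.5, y: 0.5 })); // 0
console.log(pinchAngle({ x: 0.1, y: 0.1 }, { x: 0.6, y: 0.6 })); // 90
```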

  • Browser loads MediaPipe – No install, just a CDN import
  • Webcam feeds landmarks – 21 hand points tracked in real‑time
  • JavaScript maps gestures – Translate finger positions into actions
  • Three.js renders output – Smooth 3D visualization at 60 fps

The entire pipeline runs client‑side. Your webcam feed never leaves your device.
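For reference, those 21 landmarks arrive as an array of normalized {x, y, z} points with fixed indices per joint; in MediaPipe's hand model, index 4 is the thumb tip and index 8 is the index fingertip. A minimal sketch of pulling out the two points the pinch code needs:

```javascript
// Fingertip indices in MediaPipe's 21-point hand landmark model.
const THUMB_TIP = 4;
const INDEX_TIP = 8;

// Extract the two fingertips the pinch mapping needs from one
// detected hand (an array of 21 {x, y, z} points).
function fingertips(landmarks) {
  if (landmarks.length !== 21) {
    throw new Error("expected 21 hand landmarks");
  }
  return { thumbTip: landmarks[THUMB_TIP], indexTip: landmarks[INDEX_TIP] };
}
```

In the browser loop, `landmarks` would be one entry of the per-hand results MediaPipe returns each frame; the extraction itself is plain array indexing.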

Browser SDK (zero install)

  1. Visit the Playground → Select an effect → Grant camera → Done

Cloud API

curl -X POST https://visionpipe3d.quochuy.dev/api/detect \
  -H "Authorization: Bearer YOUR_API_KEY" \
  -F "image=@hand.jpg"

New accounts get 100 free API credits—no credit card required.
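The same call can be made from Node 18+ without curl. This sketch assumes only what the curl example shows (POST, a Bearer token, and a multipart field named `image`); the response shape is not documented in this post.

```javascript
// Build the fetch options for POST /api/detect, mirroring the curl
// example: Bearer auth plus a multipart form with an "image" field.
function buildDetectRequest(apiKey, imageBlob, filename) {
  const form = new FormData(); // global in Node 18+ and browsers
  form.append("image", imageBlob, filename);
  return {
    method: "POST",
    headers: { Authorization: `Bearer ${apiKey}` },
    body: form,
  };
}

// Usage:
// const res = await fetch(
//   "https://visionpipe3d.quochuy.dev/api/detect",
//   buildDetectRequest(process.env.API_KEY, imageBlob, "hand.jpg")
// );
```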

Browser SDK vs. Cloud API

Feature                  | Browser SDK       | Cloud API
Hand Landmark Detection  | ✅                | ✅
Real-time Tracking       | ✅                | ❌ (batch only)
Offline Support          | ✅ (after load)   | ❌
Processing Location      | Client-side       | Server-side
Latency                  | ~16 ms            | Network dependent
Privacy                  | Data stays local  | Processed & discarded

Pricing

Plan     | Credits | Cost | Per Credit
Free     | 100     | $0   | —
Starter  | 1,000   | $10  | $0.010
Growth   | 5,000   | $40  | $0.008
Scale    | 20,000  | $150 | $0.0075
  • 1 credit = 1 API call
  • Credits never expire.
  • The Playground is always free.
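Since 1 credit = 1 call, estimating spend is simple multiplication. A tiny sketch using the per-credit rates from the table above:

```javascript
// Per-credit rates (USD) from the pricing table above.
const RATES = { starter: 0.010, growth: 0.008, scale: 0.0075 };

// Estimate the cost of n API calls on a plan (1 credit = 1 call).
function estimateCost(n, plan) {
  return n * RATES[plan];
}

console.log(estimateCost(1000, "starter").toFixed(2)); // "10.00"
console.log(estimateCost(20000, "scale").toFixed(2)); // "150.00"
```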

Getting Started

Browser SDK

  1. Open the Playground URL.
  2. Grant camera access.
  3. Choose an effect and start detecting.

Zero friction—no Python, no installs, no “works on my machine”.

Cloud API

# Get your API key from the console
curl -X POST https://visionpipe3d.quochuy.dev/api/detect \
  -H "Authorization: Bearer YOUR_API_KEY" \
  -F "image=@sample.jpg"

The 100 free credits are enough to build a proof‑of‑concept.

Resources

  • GitHub:
  • Website:
  • Documentation:

Call to Action

What gesture‑based interfaces have you built or wanted to build? I’m curious about the use cases people are exploring beyond the typical “air guitar” demos.
