I Built a Free Synthetic Data Generator — Here's How (React + Tailwind)

Published: February 11, 2026 at 07:13 PM EST
Source: Dev.to


Overview

We’ve all been there — you need 10,000 realistic user records to test your app, or a batch of fake healthcare data for a demo, or transaction logs to stress‑test a dashboard. So you either:

  • Write a janky script that generates “User_1, User_2, User_3…”
  • Spend 30 minutes configuring a CLI tool
  • Use a SaaS tool that limits you to 100 rows on the free tier

I got tired of this, so I built DataForge: a free, browser‑based synthetic data generator that creates realistic fake data instantly. No signup, no server, and up to 50,000 records.

🔗 Try it live


What It Does

DataForge generates realistic fake data across 9 data types in two categories.

📊 General Data

  • Users – Names, emails, phones, DOB, company, job title
  • Addresses – Street, city, state, ZIP, coordinates
  • Transactions – Amounts, merchants, categories, status

🏥 Healthcare / HIPAA‑Safe

  • Patients – MRN, blood type, allergies, conditions, insurance
  • Medical Records – ICD‑10 codes, vitals, visit types, clinical notes
  • Prescriptions – Real medications, dosages, DEA numbers, NDC codes
  • Lab Results – 26 real lab tests with reference ranges and flags
  • Insurance Claims – Charged/allowed/paid amounts, claim status
  • Healthcare Providers – NPI numbers, specialties, credentials

All data is 100% synthetic: no record is derived from a real patient, so the output contains no protected health information (PHI).

Key Features

⚡ Generate Up to 50,000 Records Instantly

Everything runs client‑side. No API calls, no server. Your data never leaves your browser.

🎬 Custom Scenarios

Preset Scenarios

  • 🧓 Elderly Patient Cohort (ages 65+)
  • 👶 Pediatric Cohort (ages 0‑17)
  • 🚨 Critical Lab Values
  • 💰 High‑Value Transactions ($5K+)
  • 🕵️ Fraud Patterns
  • ❌ Denied Claims Batch
  • 🗑️ Dirty/Messy Data (with nulls and errors)

Custom Builder

  • Set null rates per field (0‑80%)
  • Define value ranges (e.g., age 65‑95, amount $10K+)
  • Force specific values (e.g., status = "Denied")
  • Add custom value pools
  • Control duplicate rates and error injection
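
A minimal sketch of how builder options like these could be applied to already-generated rows. The names `Row`, `ScenarioConfig`, and `applyScenario` are hypothetical and are not DataForge's actual API; the order of operations (ranges, then nulls, then forced values) is also an assumption.

```typescript
// Hypothetical types for illustration; DataForge's real config shape is not shown in this post.
type Row = Record<string, string | number | null>;

interface ScenarioConfig {
  ranges?: Record<string, { min: number; max: number }>; // inclusive integer ranges
  nullRates?: Record<string, number>;                    // probability in [0, 1]
  forcedValues?: Record<string, string | number>;        // fixed value per field
}

// rand must return a float in [0, 1), e.g. Math.random or a seeded generator.
function applyScenario(rows: Row[], config: ScenarioConfig, rand: () => number): Row[] {
  return rows.map((row) => {
    const out: Row = { ...row }; // copy so the input rows are not mutated

    // 1. Redraw numeric fields uniformly from their configured range.
    for (const [field, { min, max }] of Object.entries(config.ranges ?? {})) {
      out[field] = min + Math.floor(rand() * (max - min + 1));
    }

    // 2. Blank out fields with the configured probability.
    for (const [field, rate] of Object.entries(config.nullRates ?? {})) {
      if (rand() < rate) out[field] = null;
    }

    // 3. Forced values win over everything else.
    return { ...out, ...(config.forcedValues ?? {}) };
  });
}

const claims: Row[] = [
  { id: 1, patientAge: 34, status: "Paid", paidAmount: 120.5 },
  { id: 2, patientAge: 51, status: "Pending", paidAmount: 89 },
];

const deniedElderly = applyScenario(
  claims,
  {
    ranges: { patientAge: { min: 65, max: 95 } },
    nullRates: { paidAmount: 0.8 },
    forcedValues: { status: "Denied" },
  },
  Math.random,
);
console.log(deniedElderly);
```

Passing the random source in as a parameter means the same function works with `Math.random` or with a seeded generator when the output has to be reproducible.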

📤 Export Formats

  • JSON – Standard structured data
  • CSV – For spreadsheets and databases
  • SQL – Ready‑to‑run INSERT statements
  • HL7 FHIR – Healthcare interoperability standard
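
To make the CSV and SQL formats concrete, here is a rough sketch of what such exporters might look like. `toCsv` and `toSqlInserts` are hypothetical names, not DataForge's real functions, and the sketch assumes table and column names are trusted identifiers that need no quoting.

```typescript
// Hypothetical row shape for illustration only.
type ExportValue = string | number | null;
type ExportRow = Record<string, ExportValue>;

// CSV: quote a field only when it contains a comma, quote, or line break,
// and double any embedded quotes (the usual RFC 4180 convention).
function toCsv(rows: ExportRow[]): string {
  if (rows.length === 0) return "";
  const columns = Object.keys(rows[0]);
  const escape = (v: ExportValue): string => {
    if (v === null) return "";
    const s = String(v);
    return /[",\r\n]/.test(s) ? `"${s.replace(/"/g, '""')}"` : s;
  };
  const lines = rows.map((r) => columns.map((c) => escape(r[c] ?? null)).join(","));
  return [columns.join(","), ...lines].join("\n");
}

// SQL: one INSERT per row; strings are single-quoted with embedded quotes doubled.
// Assumption: table and column names are trusted and are not escaped here.
function toSqlInserts(table: string, rows: ExportRow[]): string {
  const literal = (v: ExportValue): string => {
    if (v === null) return "NULL";
    if (typeof v === "number") return String(v);
    return `'${v.replace(/'/g, "''")}'`;
  };
  return rows
    .map((r) => {
      const cols = Object.keys(r);
      const values = cols.map((c) => literal(r[c])).join(", ");
      return `INSERT INTO ${table} (${cols.join(", ")}) VALUES (${values});`;
    })
    .join("\n");
}

const sample: ExportRow[] = [{ id: 1, name: "O'Brien, Pat", status: null }];

console.log(toCsv(sample));
// id,name,status
// 1,"O'Brien, Pat",

console.log(toSqlInserts("users", sample));
// INSERT INTO users (id, name, status) VALUES (1, 'O''Brien, Pat', NULL);
```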

🌱 Reproducible with Seeds

Set a seed value and get the exact same data every time. Perfect for consistent test suites.

How I Built It

Tech Stack

  • React 18 + TypeScript
  • Tailwind CSS – Dark HD interface
  • Vite – Fast builds
  • Custom seeded PRNG – No external faker library needed

The Seeded Random Number Generator

Instead of using Math.random() (which isn’t seedable), I built a custom PRNG based on a simple linear congruential generator:

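// Park-Miller "minimal standard" generator: a multiplicative LCG
// with multiplier 16807 and modulus 2^31 - 1.
class SeededRandom {
  private static readonly MODULUS = 2147483647; // 2^31 - 1
  private seed: number;

  constructor(seed: number) {
    // Keep the state in [1, MODULUS - 1]; a state of 0 would stay 0 forever.
    this.seed = (Math.floor(Math.abs(seed)) % SeededRandom.MODULUS) || 1;
  }

  // Returns a float strictly between 0 and 1.
  next(): number {
    this.seed = (this.seed * 16807) % SeededRandom.MODULUS;
    return this.seed / SeededRandom.MODULUS;
  }

  // Returns an integer in [min, max], both inclusive.
  nextInt(min: number, max: number): number {
    return Math.floor(this.next() * (max - min + 1)) + min;
  }

  // Returns one element of a non-empty array.
  pick<T>(array: T[]): T {
    return array[this.nextInt(0, array.length - 1)];
  }
}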
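
As a rough illustration of why seeding gives reproducible output, the sketch below builds a few user records from a seed and checks that two runs match. It restates the generator as a small closure so the snippet runs on its own; `makeRng`, `generateUsers`, `FakeUser`, and the name pools are hypothetical and are not taken from DataForge.

```typescript
// Compact restatement of the generator above so this snippet runs on its own.
function makeRng(seed: number): () => number {
  const M = 2147483647; // 2^31 - 1
  let state = (Math.floor(Math.abs(seed)) % M) || 1; // avoid the stuck state 0
  return () => {
    state = (state * 16807) % M;
    return state / M; // float strictly between 0 and 1
  };
}

// Hypothetical value pools; the real tool's pools are not shown in this post.
const FIRST_NAMES = ["Ava", "Liam", "Noah", "Mia", "Zoe"];
const LAST_NAMES = ["Nguyen", "Garcia", "Smith", "Patel", "Kim"];

interface FakeUser {
  id: number;
  name: string;
  email: string;
  age: number;
}

function generateUsers(seed: number, count: number): FakeUser[] {
  const rand = makeRng(seed);
  const pick = <T>(pool: T[]): T => pool[Math.floor(rand() * pool.length)];
  const users: FakeUser[] = [];
  for (let i = 0; i < count; i++) {
    const first = pick(FIRST_NAMES);
    const last = pick(LAST_NAMES);
    const age = 18 + Math.floor(rand() * 63); // 18..80 inclusive
    users.push({
      id: i + 1,
      name: `${first} ${last}`,
      email: `${first}.${last}${i + 1}@example.com`.toLowerCase(),
      age,
    });
  }
  return users;
}

// Same seed, same data: both runs consume the same number sequence in the same order.
const runA = generateUsers(42, 3);
const runB = generateUsers(42, 3);
console.log(JSON.stringify(runA) === JSON.stringify(runB)); // true
```

Reproducibility depends on drawing values in a fixed order, so adding a field or reordering the draws changes every record that follows, even with the same seed.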

If you work in Health IT, you know the pain:

  • ❌ You can’t use real patient data for testing (HIPAA)
  • ❌ Epic/Cerner sandboxes have limited test patients
  • ❌ Synthea is powerful but requires Java + CLI setup
  • ❌ Most online generators don’t understand healthcare data

DataForge fills this gap:

  • ✅ FHIR‑native export – Generate valid FHIR Bundles
  • ✅ Real ICD‑10 & CPT codes – Not random strings
  • ✅ Clinical scenarios – Elderly cohorts, critical labs, denied claims
  • ✅ Runs in the browser – Share the URL with your QA team
  • ✅ 50,000 records – Enough for load testing
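
For readers who have not seen one, a FHIR Bundle is a JSON wrapper around a list of resources. The sketch below shows roughly how synthetic patients could be mapped into a `collection` Bundle of `Patient` resources. `SyntheticPatient` and `toFhirBundle` are hypothetical names, the mapping covers only a handful of fields, and the output has not been checked against a FHIR validator.

```typescript
// Hypothetical input shape; DataForge's internal patient model is not shown in this post.
interface SyntheticPatient {
  id: string;        // must be a valid FHIR id: letters, digits, "-" and ".", max 64 chars
  givenName: string;
  familyName: string;
  gender: "male" | "female" | "other" | "unknown"; // FHIR administrative gender codes
  birthDate: string; // YYYY-MM-DD
}

// Wrap each patient in a Bundle entry as a minimal Patient resource.
function toFhirBundle(patients: SyntheticPatient[]) {
  return {
    resourceType: "Bundle",
    type: "collection",
    entry: patients.map((p) => ({
      resource: {
        resourceType: "Patient",
        id: p.id,
        name: [{ family: p.familyName, given: [p.givenName] }],
        gender: p.gender,
        birthDate: p.birthDate,
      },
    })),
  };
}

const bundle = toFhirBundle([
  { id: "pat-0001", givenName: "Ava", familyName: "Smith", gender: "female", birthDate: "1950-03-14" },
]);
console.log(JSON.stringify(bundle, null, 2));
```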

What’s Next

  • Custom schema builder (define your own data types)
  • API endpoint mode (use as a mock API)
  • Relationships between tables (foreign keys)
  • More healthcare standards (HL7v2 messages, C‑CDA)
  • Localization (non‑US names, addresses, phone formats)

If this tool saves you time, drop a ⭐ on the repo or leave a comment. I’d love to hear how you’re using it!
