I Built a Free Synthetic Data Generator — Here's How (React + Tailwind)
Source: Dev.to

Overview
We’ve all been there — you need 10,000 realistic user records to test your app, or a batch of fake healthcare data for a demo, or transaction logs to stress‑test a dashboard. So you either:
- Write a janky script that generates “User_1, User_2, User_3…”
- Spend 30 minutes configuring a CLI tool
- Use a SaaS tool that limits you to 100 rows on the free tier
I got tired of this, so I built DataForge — a free, browser‑based synthetic data generator that creates realistic fake data instantly. No signup, no server, no limits.


What It Does
DataForge generates realistic fake data across 9 data types in two categories.
📊 General Data
- Users – Names, emails, phones, DOB, company, job title
- Addresses – Street, city, state, ZIP, coordinates
- Transactions – Amounts, merchants, categories, status
🏥 Healthcare / HIPAA‑Safe
- Patients – MRN, blood type, allergies, conditions, insurance
- Medical Records – ICD‑10 codes, vitals, visit types, clinical notes
- Prescriptions – Real medications, dosages, DEA numbers, NDC codes
- Lab Results – 26 real lab tests with reference ranges and flags
- Insurance Claims – Charged/allowed/paid amounts, claim status
- Healthcare Providers – NPI numbers, specialties, credentials
All data is 100% synthetic: no real patient data, no HIPAA risk.
Key Features
⚡ Generate Up to 50,000 Records Instantly
Everything runs client‑side. No API calls, no server. Your data never leaves your browser.
🎬 Custom Scenarios
Preset Scenarios
- 🧓 Elderly Patient Cohort (ages 65+)
- 👶 Pediatric Cohort (ages 0‑17)
- 🚨 Critical Lab Values
- 💰 High‑Value Transactions ($5K+)
- 🕵️ Fraud Patterns
- ❌ Denied Claims Batch
- 🗑️ Dirty/Messy Data (with nulls and errors)
Custom Builder
- Set null rates per field (0‑80%)
- Define value ranges (e.g., age 65‑95, amount $10K+)
- Force specific values (e.g., status = "Denied")
- Add custom value pools
- Control duplicate rates and error injection
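To make the builder options concrete, here is a rough sketch of how per-field rules like these could be modelled. This is not DataForge's actual code: `FieldRule`, `applyRule`, `intInRange`, and the `rand` parameter are hypothetical names, and duplicate rates and error injection are left out.

```typescript
// Hypothetical per-field rule mirroring some of the builder options above.
interface FieldRule<T> {
  nullRate?: number; // probability that the field comes out null (e.g. 0 to 0.8)
  forced?: T;        // when set, always emit this value
  pool?: T[];        // custom value pool to draw from
}

// `rand` is any function returning a number in [0, 1), such as a seeded PRNG.
function applyRule<T>(generated: T, rule: FieldRule<T>, rand: () => number): T | null {
  // Null injection is checked first, so it can also blank out forced values.
  if (rule.nullRate !== undefined && rand() < rule.nullRate) return null;
  if (rule.forced !== undefined) return rule.forced;
  if (rule.pool && rule.pool.length > 0) {
    return rule.pool[Math.floor(rand() * rule.pool.length)] as T;
  }
  return generated;
}

// Integer in [min, max], inclusive, for range constraints such as age 65 to 95.
function intInRange(min: number, max: number, rand: () => number): number {
  return Math.floor(rand() * (max - min + 1)) + min;
}

// Example: force every claim status to "Denied", with 10% of them nulled out.
const statusRule: FieldRule<string> = { forced: "Denied", nullRate: 0.1 };
const status = applyRule<string>("Paid", statusRule, Math.random);
console.log(status); // "Denied" or null
```

Passing the random source in as a parameter keeps the rule logic compatible with a seeded generator, so the same seed would still give the same nulls and picks.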
📤 Export Formats
- JSON – Standard structured data
- CSV – For spreadsheets and databases
- SQL – Ready‑to‑run INSERT statements
- HL7 FHIR – Healthcare interoperability standard
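As an illustration of what the CSV and SQL exports involve, the sketch below serializes flat records with basic escaping. It is a minimal example under stated assumptions, not the tool's real exporter: `Row`, `toCsv`, and `toSqlInserts` are hypothetical names, every row is assumed to share the first row's columns, and table and column names are assumed to be trusted identifiers.

```typescript
type Row = Record<string, string | number | boolean | null>;

// Quote a CSV field when it contains a comma, double quote, or line break.
function csvField(value: Row[string]): string {
  if (value === null) return "";
  const text = String(value);
  return /[",\r\n]/.test(text) ? `"${text.replace(/"/g, '""')}"` : text;
}

function toCsv(rows: Row[]): string {
  const first = rows[0];
  if (first === undefined) return "";
  const columns = Object.keys(first);
  const lines = [columns.map((c) => csvField(c)).join(",")];
  for (const row of rows) {
    lines.push(columns.map((c) => csvField(row[c] ?? null)).join(","));
  }
  return lines.join("\n");
}

// Render a value as a SQL literal; single quotes inside strings are doubled.
function sqlLiteral(value: Row[string]): string {
  if (value === null) return "NULL";
  if (typeof value === "number") return Number.isFinite(value) ? String(value) : "NULL";
  if (typeof value === "boolean") return value ? "TRUE" : "FALSE";
  return `'${value.replace(/'/g, "''")}'`;
}

// One INSERT statement per row; identifiers are NOT escaped in this sketch.
function toSqlInserts(table: string, rows: Row[]): string {
  return rows
    .map((row) => {
      const columns = Object.keys(row);
      const values = columns.map((c) => sqlLiteral(row[c] ?? null));
      return `INSERT INTO ${table} (${columns.join(", ")}) VALUES (${values.join(", ")});`;
    })
    .join("\n");
}

const sample: Row[] = [{ name: "O'Brien", amount: 12.5, note: null }];
console.log(toCsv(sample));
console.log(toSqlInserts("users", sample));
```

The escaping matters most for the "dirty data" scenarios, where names and notes may contain commas or quotes that would otherwise break the output.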
🌱 Reproducible with Seeds
Set a seed value and get the exact same data every time. Perfect for consistent test suites.
How I Built It
Tech Stack
- React 18 + TypeScript
- Tailwind CSS – Dark HD interface
- Vite – Fast builds
- Custom seeded PRNG – No external faker library needed
The Seeded Random Number Generator
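Instead of using Math.random() (which isn’t seedable), I built a custom PRNG based on a simple linear congruential generator, using the Park‑Miller "minimal standard" constants (multiplier 16807, modulus 2^31 − 1 = 2147483647):
```typescript
class SeededRandom {
  private seed: number;

  constructor(seed: number) {
    // Map the seed into [1, 2147483646]; a state of 0 would stay 0 forever.
    this.seed = Math.floor(Math.abs(seed)) % 2147483647 || 1;
  }

  // Advances the state and returns a float strictly between 0 and 1.
  next(): number {
    this.seed = (this.seed * 16807) % 2147483647;
    return this.seed / 2147483647;
  }

  // Returns an integer in [min, max], inclusive on both ends.
  nextInt(min: number, max: number): number {
    return Math.floor(this.next() * (max - min + 1)) + min;
  }

  // Returns a random element; the array must be non-empty.
  pick<T>(array: T[]): T {
    return array[this.nextInt(0, array.length - 1)];
  }
}
```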
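To show what the seed buys you, here is a small reproducibility check. It re-declares a compact stand-in for the generator (same multiplier and modulus) so the snippet runs on its own; `createRng`, `fakeUsers`, and the name pools are hypothetical and not taken from DataForge.

```typescript
// Compact stand-in for the generator above, re-declared so this snippet is
// self-contained. Assumes a positive integer seed below 2147483647.
function createRng(seed: number): () => number {
  let state = seed;
  return () => {
    state = (state * 16807) % 2147483647;
    return state / 2147483647;
  };
}

// Hypothetical name pools; the tool's real pools are not shown in this post.
const FIRST_NAMES = ["Ava", "Liam", "Noah", "Mia"];
const LAST_NAMES = ["Nguyen", "Patel", "Garcia", "Smith"];

function pickFrom<T>(rand: () => number, items: T[]): T {
  return items[Math.floor(rand() * items.length)] as T;
}

// Builds `count` fake full names from a fresh generator seeded with `seed`.
function fakeUsers(seed: number, count: number): string[] {
  const rand = createRng(seed);
  const users: string[] = [];
  for (let i = 0; i < count; i++) {
    users.push(`${pickFrom(rand, FIRST_NAMES)} ${pickFrom(rand, LAST_NAMES)}`);
  }
  return users;
}

const runA = fakeUsers(42, 3);
const runB = fakeUsers(42, 3);
console.log(runA.join(" | ") === runB.join(" | ")); // true: same seed, same data
```

Because every random choice flows through one generator, the order of calls matters: adding a new field in the middle of a record changes everything generated after it for a given seed.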
If you work in Health IT, you know the pain:
- ❌ You can’t use real patient data for testing (HIPAA)
- ❌ Epic/Cerner sandboxes have limited test patients
- ❌ Synthea is powerful but requires Java + CLI setup
- ❌ Most online generators don’t understand healthcare data
DataForge fills this gap:
- ✅ FHIR‑native export — Generate valid FHIR Bundles
- ✅ Real ICD‑10 & CPT codes — Not random strings
- ✅ Clinical scenarios — Elderly cohorts, critical labs, denied claims
- ✅ Runs in the browser — Share the URL with your QA team
- ✅ 50 K records — Enough for load testing
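For a sense of what a FHIR export produces, the sketch below wraps synthetic patients in `Patient` resources inside a `Bundle` of type `collection`. It covers only a handful of FHIR R4 fields and is not validated against the specification; `SyntheticPatient` and `toFhirBundle` are hypothetical names, not the tool's real types.

```typescript
// Minimal subset of the FHIR R4 Patient shape; real resources carry many more fields.
interface FhirPatient {
  resourceType: "Patient";
  id: string;
  name: { family: string; given: string[] }[];
  gender: "male" | "female" | "other" | "unknown";
  birthDate: string; // YYYY-MM-DD
}

interface FhirBundle {
  resourceType: "Bundle";
  type: "collection";
  entry: { resource: FhirPatient }[];
}

// Hypothetical flat record standing in for the generator's internal format.
interface SyntheticPatient {
  id: string;
  firstName: string;
  lastName: string;
  gender: FhirPatient["gender"];
  dob: string; // YYYY-MM-DD
}

function toFhirBundle(patients: SyntheticPatient[]): FhirBundle {
  return {
    resourceType: "Bundle",
    type: "collection",
    entry: patients.map((p) => ({
      resource: {
        resourceType: "Patient",
        id: p.id,
        name: [{ family: p.lastName, given: [p.firstName] }],
        gender: p.gender,
        birthDate: p.dob,
      },
    })),
  };
}

const bundle = toFhirBundle([
  { id: "pt-001", firstName: "Ava", lastName: "Nguyen", gender: "female", dob: "1950-03-14" },
]);
console.log(JSON.stringify(bundle, null, 2));
```

Anything consuming the output should still run it through a FHIR validator, since this sketch checks nothing beyond TypeScript's types.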
What’s Next
- Custom schema builder (define your own data types)
- API endpoint mode (use as a mock API)
- Relationships between tables (foreign keys)
- More healthcare standards (HL7v2 messages, C‑CDA)
- Localization (non‑US names, addresses, phone formats)
If this tool saves you time, drop a ⭐ on the repo or leave a comment. I’d love to hear how you’re using it!