How I Found 1,370 Fraudsters Hiding in Our Data (And Saved My Company $51,000)

Published: December 29, 2025 at 06:57 PM EST


Source: Dev.to

The First Clue: When Numbers Tell a Story

Opening the data felt like looking at two different worlds.

Credit‑card transactions: fraud in only 0.5 % of cases – tiny red dots in a sea of green.
E‑commerce platform: nearly 1 in 3 transactions were fraudulent.

I remember thinking,

“How are we even still in business?”

That’s when I built my first visualization—side‑by‑side bars showing the stark difference. Seeing it visually made the problem real; it wasn’t just numbers anymore, it was a pattern screaming for attention.

Side‑by‑side bar chart of fraud rates in credit‑card vs. e‑commerce data
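Rates this lopsided also change how a model should be judged. The sketch below uses toy label lists built only to match the two rates quoted above (they are not the real data) and shows why plain accuracy is misleading at 0.5 % fraud: a model that never flags anything still scores 99.5 %.

```python
def fraud_rate(labels):
    """Fraction of transactions labelled as fraud (1 = fraud, 0 = legitimate)."""
    return sum(labels) / len(labels)


def always_legit_accuracy(labels):
    """Accuracy of a useless model that never flags anything."""
    return 1.0 - fraud_rate(labels)


# Toy label lists built to match the quoted rates (not the real data).
credit_card = [1] * 5 + [0] * 995   # 0.5 % fraud
ecommerce = [1] * 1 + [0] * 2       # roughly 1 in 3 fraud

for name, labels in [("credit-card", credit_card), ("e-commerce", ecommerce)]:
    print(f"{name}: fraud rate {fraud_rate(labels):.1%}, "
          f"do-nothing accuracy {always_legit_accuracy(labels):.1%}")
```

That is one reason the rest of this post talks about frauds caught and false alarms rather than accuracy.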

The Breakthrough: The 1‑Hour Rule

It started as a hunch: “What if fraudsters work fast?”

I created a simple calculation: hours between account creation and first purchase. When I plotted it, my coffee went cold.

There it was: a massive spike at the beginning. Transactions within the first hour had a 99.5% fraud rate, 6,685 cases of “sign up, steal, disappear.”

The visualization looked like a mountain with the peak shoved all the way to the left. It was so clear, so obvious. How had we missed this?

Histogram of fraud rate by hours since account creation
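The calculation behind that histogram can be sketched in a few lines. This is a dependency-free illustration, not the project's actual code: the function names, the row layout `(signup_time, purchase_time, is_fraud)`, and the sample rows are all assumptions made up for the example.

```python
from collections import defaultdict
from datetime import datetime


def hours_since_signup(signup_time, purchase_time):
    """Hours elapsed between account creation and the purchase."""
    return (purchase_time - signup_time).total_seconds() / 3600.0


def fraud_rate_by_bucket(rows, bucket_hours=1):
    """Group transactions into buckets of `bucket_hours` and return
    {bucket_start_hour: (fraud_rate, transaction_count)}."""
    counts = defaultdict(lambda: [0, 0])  # bucket -> [fraud count, total count]
    for signup_time, purchase_time, is_fraud in rows:
        hours = hours_since_signup(signup_time, purchase_time)
        bucket = int(hours // bucket_hours) * bucket_hours
        counts[bucket][0] += int(is_fraud)
        counts[bucket][1] += 1
    return {b: (f / n, n) for b, (f, n) in sorted(counts.items())}


# Made-up rows: (signup_time, purchase_time, is_fraud)
rows = [
    (datetime(2025, 1, 1, 9, 0), datetime(2025, 1, 1, 9, 5), True),
    (datetime(2025, 1, 1, 9, 0), datetime(2025, 1, 1, 9, 40), True),
    (datetime(2025, 1, 1, 9, 0), datetime(2025, 1, 1, 12, 30), False),
    (datetime(2025, 1, 1, 9, 0), datetime(2025, 1, 3, 9, 0), False),
]

for bucket, (rate, n) in fraud_rate_by_bucket(rows).items():
    print(f"{bucket:>3}h+: {rate:.0%} fraud across {n} transaction(s)")
```

Returning the transaction count next to each rate matters: a bucket with a handful of rows can show an extreme rate by chance, so the count tells you how much to trust the bar.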

Building the Fraud Catchers

Channel	Model	Reason	Results
Credit‑card	XGBoost	Powerful ensemble that learns complex interactions	76 fraudsters caught, 15 false alarms
E‑commerce	Logistic Regression	High interpretability for customer‑facing decisions	1,370 of 1,409 frauds caught, with clear explanations

My model‑comparison chart tells the story—different problems need different tools.

Model comparison chart for credit‑card and e‑commerce fraud detection
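The results in the table translate directly into the two standard metrics. The arithmetic below assumes that "false alarms" means legitimate transactions flagged on the test set, and that 1,409 is the total number of frauds in the e‑commerce test set. The table does not report missed credit‑card frauds or e‑commerce false alarms, so the other two numbers cannot be derived from it.

```python
def precision(true_positives, false_positives):
    """Share of flagged transactions that really were fraud."""
    return true_positives / (true_positives + false_positives)


def recall(true_positives, false_negatives):
    """Share of actual frauds that the model caught."""
    return true_positives / (true_positives + false_negatives)


# Credit-card model: 76 frauds caught, 15 false alarms.
cc_precision = precision(76, 15)

# E-commerce model: 1,370 caught out of 1,409 frauds, so 39 were missed.
ecom_recall = recall(1370, 1409 - 1370)

print(f"credit-card precision: {cc_precision:.1%}")  # 83.5%
print(f"e-commerce recall:     {ecom_recall:.1%}")   # 97.2%
```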

The Most Fascinating Part: Asking “Why?”

Using SHAP felt like putting on X‑ray glasses. Suddenly I could see what the model was thinking.

The top predictors weren’t what I expected. An anonymized V4 feature mattered most, followed by our custom anomaly score. The model was finding patterns in places I hadn’t even looked.

The real magic was in the individual cases. A SHAP force plot for a caught $257 fraud let me trace exactly why—the timing, a weird V14 value, and the new account. It wasn’t magic; it was math we could explain.

SHAP force plot for a $257 fraud case
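For a linear model such as logistic regression, the idea behind a force plot fits in one formula: assuming independent features, each feature's SHAP value is its weight times how far the value sits from the average, measured in log-odds. The sketch below illustrates only that linear case, with invented weights, averages, and transaction values. Tree models like XGBoost need a tree-specific algorithm, which the shap library provides through `TreeExplainer`.

```python
def linear_shap_values(weights, x, background_means):
    """Per-feature contributions for a linear model, in log-odds units.

    With independent features, the SHAP value of feature i is
    weight_i * (x_i - mean_i).
    """
    return {
        name: weights[name] * (x[name] - background_means[name])
        for name in weights
    }


# All numbers below are invented for illustration.
weights = {"hours_since_signup": -0.1, "V14": -0.5, "amount": 0.002}
background_means = {"hours_since_signup": 30.0, "V14": 0.0, "amount": 90.0}
transaction = {"hours_since_signup": 0.5, "V14": -6.0, "amount": 257.0}

contributions = linear_shap_values(weights, transaction, background_means)

# Largest absolute contribution first, like the widest arrows in a force plot.
for name, value in sorted(contributions.items(), key=lambda kv: -abs(kv[1])):
    direction = "toward fraud" if value > 0 else "toward legitimate"
    print(f"{name:>20}: {value:+.2f} ({direction})")
```

The contributions add up to the gap between this transaction's log-odds and the log-odds of an average transaction, which is what makes the explanation complete rather than a loose ranking.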

From Insights to Action: Three Changes We’re Making

The 1‑Hour Checkpoint – Starting Monday, any purchase within an hour of signup will trigger a gentle extra verification step (e.g., “Hey, confirm this is you?”). Based on our data, this alone could stop thousands of fraudulent attempts.
Smarter Geography – We found countries with shockingly high fraud rates (looking at you, Turkmenistan at 100 %). Rather than blanket blocks, we’ll add intelligent scrutiny: legitimate customers get through, fraudsters hit roadblocks.
Dynamic Decisions – Our confusion matrices showed that each channel needs a different trade‑off:
- Credit‑card channel: prioritize precision (be very sure before flagging).
- E‑commerce channel: prioritize recall (catch more frauds while maintaining explainability).

Confusion matrices for credit‑card and e‑commerce models
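Put together, the checkpoint and the per‑channel trade‑off could look something like the sketch below. The threshold values, the channel keys, and the `decide` function are hypothetical and chosen only to show the shape of the logic. Real thresholds would be tuned on validation data.

```python
# Hypothetical thresholds; real values would be tuned on validation data.
THRESHOLDS = {
    "credit_card": 0.90,  # high bar: favour precision, few false alarms
    "ecommerce": 0.30,    # low bar: favour recall, catch more fraud
}
CHECKPOINT_HOURS = 1.0


def decide(channel, fraud_score, hours_since_signup=None):
    """Return 'verify', 'flag', or 'allow' for one transaction."""
    # The 1-hour checkpoint: extra verification for brand-new accounts.
    if hours_since_signup is not None and hours_since_signup <= CHECKPOINT_HOURS:
        return "verify"
    # Otherwise compare the model score with the channel's own threshold.
    if fraud_score >= THRESHOLDS[channel]:
        return "flag"
    return "allow"


print(decide("ecommerce", 0.10, hours_since_signup=0.2))  # verify
print(decide("ecommerce", 0.45, hours_since_signup=72))   # flag
print(decide("credit_card", 0.45))                        # allow
```

The same score of 0.45 is flagged on one channel and allowed on the other, which is the whole point of setting the bar per channel instead of globally.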

The Business Impact (Or: How I Justified My Salary)

Let’s talk numbers

Test‑data impact: $51,000 saved
Monthly projection: $200,000+
Annual potential: Millions

But it’s not just about money—trust matters. We can now tell customers exactly why their transaction was flagged, eliminating the “the system says so” black‑box feeling.

The financial‑impact visualization made my case to management in 10 seconds flat.

What I Wish I Knew Then

Simple beats complex – The 1‑hour rule required no machine learning to discover.
Explainability matters – Logistic Regression won for e‑commerce because we could defend it.
Fraudsters adapt – Today’s patterns become tomorrow’s history.

The Big Realization

The most valuable insight wasn’t in the fancy algorithms. It was in asking a simple question:

“What happens right after someone signs up?”

Sometimes the most powerful data science is asking obvious questions and having the courage to believe the answers, even when they seem too simple to be true.

Want to see how we did it?
The code, the struggles, and the celebrations are all here:

Question for you: What’s the most surprising pattern you’ve found in your data?

Coffee consumption during this project: 47 cups ☕
Regrets: Zero
