I tried parsing emails with regex. It went exactly how you think.

Published: March 2, 2026 at 10:24 AM EST
3 min read
Source: Dev.to

Introduction

Recently I needed to process incoming emails automatically.

The idea sounded simple:

Email arrives → extract some fields → trigger a webhook

Things like:

  • order confirmations
  • invoice emails
  • shipping notifications
  • support messages

Nothing complicated. Or so I thought.

Attempt #1 — Regex

Like most developers, I started with regex.

const match = email.match(/Total:\s\$(\d+)/)
const price = match ? Number(match[1]) : null

For the first email it worked perfectly. Then the next email came in and said:

Amount paid: $29

Another one said:

Total price: USD 29

Then an HTML email arrived with nested tables, inline styles, and formatting from what looked like 2004 Outlook templates.

At this point my regex slowly evolved into something like this:

/(Total|Amount|Price).*?(\$|USD)?\s?(\d+(\.\d+)?)/ 

Which is usually the moment you realize the approach is already doomed.
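
To make the fragility concrete, here is a small sketch that runs the pattern above against three sample lines. The sample strings and the extractAmount helper are invented for illustration; they are not taken from real emails.

```javascript
// The "evolved" pattern from above, unchanged.
const totalPattern = /(Total|Amount|Price).*?(\$|USD)?\s?(\d+(\.\d+)?)/

// Hypothetical sample lines, invented for illustration.
const samples = [
  'Total: $39',
  'Total for order 1234: $39',
  'Total: $1,299.00',
]

// Return the captured amount as a string, or null when nothing matches.
function extractAmount(text) {
  const match = text.match(totalPattern)
  return match ? match[3] : null
}

for (const line of samples) {
  console.log(JSON.stringify(line), '->', extractAmount(line))
}
// "Total: $39" -> 39
// "Total for order 1234: $39" -> 1234  (grabs the order number)
// "Total: $1,299.00" -> 1              (stops at the thousands separator)
```

The lazy `.*?` happily skips ahead to the first run of digits it can find, so any number between the label and the real amount wins.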

Attempt #2 — Parsing the HTML

Okay fine. Let’s parse the HTML instead.

const { JSDOM } = require('jsdom')
const dom = new JSDOM(emailHtml)

Which sometimes worked. Except email HTML is a special kind of chaos:

  • tables inside tables
  • inline styles everywhere
  • different layouts for every sender

And suddenly you’re maintaining custom parsers for every email format.

The real problem

Emails aren’t structured data. They’re written for humans, not machines. Every sender formats them differently, and trying to enforce rigid parsing rules becomes fragile very quickly.

The obvious solution (in hindsight)

Instead of trying to force strict parsing rules, why not let AI interpret the email and extract the fields you want?

Example email:

Subject: Order confirmation
Customer: John Smith
Product: T-shirt
Total: $39

Structured output:

{
  "customer": "John Smith",
  "product": "T-shirt",
  "total": 39
}

Now your backend receives clean structured data instead of raw email text.
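
Even clean-looking output is worth a quick check before your backend trusts it, since an extracted field can go missing or come back as the wrong type. Here is a minimal sketch, assuming a hypothetical schema format that maps each field name to its expected `typeof` result. Both orderSchema and validateFields are made up for this example and are not any tool's real schema format.

```javascript
// Hypothetical schema: field name -> expected `typeof` result.
const orderSchema = {
  customer: 'string',
  product: 'string',
  total: 'number',
}

// Return a list of human-readable problems; an empty list means the data fits.
function validateFields(data, schema) {
  const problems = []
  for (const [field, expectedType] of Object.entries(schema)) {
    if (!(field in data)) {
      problems.push(`missing field: ${field}`)
    } else if (typeof data[field] !== expectedType) {
      problems.push(`${field} should be ${expectedType}, got ${typeof data[field]}`)
    }
  }
  return problems
}

const extracted = { customer: 'John Smith', product: 'T-shirt', total: 39 }
console.log(validateFields(extracted, orderSchema)) // []

// A total that came back as a string, plus a missing product.
console.log(validateFields({ customer: 'John Smith', total: '39' }, orderSchema))
```

Returning a list of problems instead of throwing makes it easy to log everything that is wrong with one email in a single pass.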

So I built a small tool

Mostly because I kept running into this problem again and again. It’s called ParseForce.

The flow is simple:

Incoming email → AI parsing → structured JSON → webhook

You:

  • Get a unique inbox
  • Send emails to it
  • Define the schema you want
  • Receive structured JSON in your webhook

That’s it.
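
On the receiving side, the webhook can be a very small HTTP endpoint. The sketch below uses only Node's built-in http module. The payload shape is an assumption (a flat JSON object of extracted fields, like the order example earlier), and handleWebhookBody is a hypothetical helper, so check the real payload format before relying on this.

```javascript
const http = require('node:http')

// Turn a raw request body into a status code plus the parsed fields.
// Assumes the webhook posts a single JSON object (an assumption, see above).
function handleWebhookBody(rawBody) {
  let payload
  try {
    payload = JSON.parse(rawBody)
  } catch (err) {
    return { status: 400, error: 'invalid JSON' }
  }
  if (payload === null || typeof payload !== 'object' || Array.isArray(payload)) {
    return { status: 400, error: 'expected a JSON object' }
  }
  return { status: 200, fields: payload }
}

const server = http.createServer((req, res) => {
  if (req.method !== 'POST') {
    res.writeHead(405)
    res.end()
    return
  }
  let raw = ''
  req.on('data', (chunk) => { raw += chunk })
  req.on('end', () => {
    const result = handleWebhookBody(raw)
    if (result.status === 200) {
      // Hand result.fields to your own code here (create an order, notify someone, ...).
      console.log('parsed email fields:', result.fields)
    }
    res.writeHead(result.status, { 'Content-Type': 'application/json' })
    res.end(JSON.stringify({ ok: result.status === 200 }))
  })
})

// Uncomment to run locally; port 3000 is an arbitrary choice.
// server.listen(3000)
```

Keeping the body handling in a plain function means it can be tested without starting a server.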

Some things it works well for

So far I’ve been using it for:

  • parsing order confirmation emails
  • extracting invoice data
  • processing lead emails
  • triggering automation workflows

Basically anything where an email contains data you want your system to understand.

If you’re curious

You can check it out here: 👉 https://parseforce.io

I’m also curious how others deal with this problem. Are you using regex, templates, or something else entirely?

Tags: node, webdev, saas, automation, ai
