Building WatchContact: An AI Chat Analyzer with OCR, Rate Limiting, and a Content Blog

Published: March 12, 2026 at 07:19 PM EDT
7 min read
Source: Dev.to

What We Built

WatchContact is a two‑part product:

  1. AI Chat Analyzer – a free tool where users paste a conversation (or upload a screenshot) and receive an AI‑powered analysis that includes intent, tone, risk level, and suggested replies.
  2. Blog – articles covering messaging psychology, WhatsApp behavior, texting etiquette, and relationship signals.

The goal was to ship something useful quickly: no authentication, no signup—just paste and analyze.

Architecture Diagram

flowchart LR
    A[Next.js 14<br/>(Frontend)] --> B[Express API<br/>(Backend)]
    B --> C[MongoDB<br/>(Analytics)]

    B -->|OpenAI API| D[OpenAI Service]
    B -->|Tesseract.js (OCR)| E[OCR Service]
    B -->|Cloudflare R2| F[Object Storage]

    A -->|Markdown blog (static)| G[Static Site Generator]
  • Next.js 14 – Serves the React‑based frontend and generates the static Markdown blog.
  • Express API – Handles application logic, routes requests to external services, and talks to MongoDB.
  • MongoDB – Stores analytics and other persistent data.
  • OpenAI API – Generates the conversation analysis as structured JSON.
  • Tesseract.js (OCR) – Extracts text from uploaded screenshots on the backend.
  • Cloudflare R2 – Object storage for uploaded screenshots.

Tech Stack

| Layer      | Technologies                             |
|------------|------------------------------------------|
| Frontend   | Next.js 14, React, Tailwind CSS          |
| Backend    | Express, Mongoose, MongoDB               |
| AI         | OpenAI GPT (conversation analysis)       |
| OCR        | Tesseract.js                             |
| Storage    | Cloudflare R2 (S3‑compatible)            |
| Auth       | None – IP‑based rate limiting            |
| Deployment | Vercel / Cloudflare Pages (static blog)  |

User Flow

  1. Text input – The user pastes or types a conversation.

    • The request is sent directly to /api/analysis/text.
  2. Screenshot input – The user pastes an image (Ctrl + V) or uploads a file.

    • The backend runs OCR, optionally stores the image in R2, then analyzes the extracted text.

Both modes share a single input area that automatically detects whether the clipboard contains an image or plain text.
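
The clipboard detection could look roughly like the sketch below. It is not the project's actual code: `classifyClipboard` is a hypothetical helper that takes entries shaped like the browser's `DataTransferItem` objects (`kind` and `type`) and reports which mode to use. Preferring the image when both an image and text are present, and reusing the upload button's PNG/JPEG/WebP list for pasted images, are both assumptions.

```javascript
// Hypothetical helper: decide whether pasted clipboard data should be
// treated as a screenshot or as plain text. Entries mirror the `kind`
// and `type` fields of the browser's DataTransferItem.
const ACCEPTED_IMAGE_TYPES = ['image/png', 'image/jpeg', 'image/webp'];

function classifyClipboard(items) {
  const list = Array.from(items || []);

  // A file with an accepted image MIME type wins over accompanying text.
  const hasImage = list.some(
    (item) => item.kind === 'file' && ACCEPTED_IMAGE_TYPES.includes(item.type)
  );
  if (hasImage) return 'image';

  const hasText = list.some(
    (item) => item.kind === 'string' && item.type === 'text/plain'
  );
  return hasText ? 'text' : 'unsupported';
}
```

In a browser paste handler, the argument would be `event.clipboardData.items`; the result then decides whether to show the image preview or keep the textarea.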

API Endpoints

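| Method | Path                             | Purpose                                         |
|--------|----------------------------------|-------------------------------------------------|
| POST   | /api/analysis/text               | Analyze pasted text                             |
| POST   | /api/analysis/screenshot-extract | Perform OCR and optionally upload to R2         |
| POST   | /api/analysis/screenshot-final   | Analyze OCR‑extracted text (re‑uses rate limit) |
| GET    | /api/analysis/limit-status       | Return remaining analyses for the day           |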

Rate Limiting (No Signup)

We limit each IP to 3 analyses per day.

// rateLimit.service.js – simplified
const DAILY_LIMIT = 3;

async function checkAndIncrement(ipAddress) {
  // Development bypass
  if (isLocalhost(ipAddress)) return { allowed: true };

  // Use the current UTC date as the key (YYYY-MM-DD); toISOString() always reports UTC
  const dateKey = new Date().toISOString().slice(0, 10);
  let record = await IpUsage.findOne({ ipAddress, dateKey });

  // Create a new record if none exists for today
  if (!record) {
    record = await IpUsage.create({ ipAddress, dateKey, analysisCount: 0 });
  }

  // Deny if the daily limit has been reached
  if (record.analysisCount >= DAILY_LIMIT) {
    return { allowed: false };
  }

  // Increment the count and persist
  record.analysisCount += 1;
  await record.save();

  return { allowed: true };
}
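
Because the simplified function reads the record and saves it in two separate steps, two concurrent requests from the same IP could both pass the check. One possible alternative, sketched here as an assumption rather than the project's code, folds the increment into a single `findOneAndUpdate` call with `$inc` and `upsert`. The model is passed in as a parameter so the sketch has no imports; `checkAndIncrementAtomic` is a hypothetical name, the localhost bypass is omitted for brevity, and the approach assumes a unique index on `{ ipAddress, dateKey }`.

```javascript
const DAILY_LIMIT = 3;

// Hypothetical atomic variant. `IpUsageModel` is expected to expose
// Mongoose's Model.findOneAndUpdate(filter, update, options).
async function checkAndIncrementAtomic(IpUsageModel, ipAddress, now = new Date()) {
  // toISOString() is UTC, so the "day" rolls over at 00:00 UTC.
  const dateKey = now.toISOString().slice(0, 10);

  // Increment and read back the counter in one database operation.
  const record = await IpUsageModel.findOneAndUpdate(
    { ipAddress, dateKey },
    { $inc: { analysisCount: 1 } },
    { upsert: true, new: true }
  );

  // Unlike the original, denied requests also bump the counter, so it
  // can exceed the limit; only the comparison matters for the decision.
  return { allowed: record.analysisCount <= DAILY_LIMIT };
}
```

The trade-off is that `analysisCount` no longer equals the number of successful analyses once the limit is hit, so anything that displays it should clamp the value to the limit.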

Key Components

  • Model: IpUsage { ipAddress, dateKey, analysisCount }
  • Frontend: Calls GET /api/analysis/limit-status to display “X of 3 analyses remaining today”.
  • Localhost: Exempt from rate limiting for development.
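
The remaining count shown in the UI follows directly from the stored record. Here is a minimal sketch of how the `limit-status` payload could be derived; the helper names and the `limit`, `used`, and `remaining` fields are assumptions, since the article does not show the response shape.

```javascript
const DAILY_LIMIT = 3;

// Hypothetical helper: turn today's IpUsage record (or null when the IP
// has not analyzed anything yet) into a limit-status payload.
function buildLimitStatus(record, limit = DAILY_LIMIT) {
  const count = record ? record.analysisCount : 0;
  // Clamp so a counter at or above the limit never yields a negative value.
  const used = Math.min(Math.max(count, 0), limit);
  return { limit, used, remaining: limit - used };
}

// Produces the wording used in the UI.
function formatLimitStatus(status) {
  return `${status.remaining} of ${status.limit} analyses remaining today`;
}
```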

Prompt & Response Schema

We send a structured system prompt so the model always returns JSON in a predictable shape.

{
  "overallSignal": "Interested" | "Hesitant" | "Mixed Signals" | "Low Interest" | "Neutral",
  "intentSummary": "Short paragraph...",
  "toneAnalysis": [
    "observation 1",
    "observation 2",
    "..."
  ],
  "riskLevel": "Low" | "Medium" | "High",
  "riskExplanation": "Brief explanation",
  "suggestedReplies": {
    "safe": "A cautious reply",
    "confident": "A more direct reply",
    "warm": "A warm, casual reply"
  },
  "disclaimer": "This is a behavioral interpretation, not certainty."
}
  • The server validates the JSON, rejecting responses that are missing any required keys.
  • Token usage and estimated cost are stored in MongoDB for internal monitoring only.
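
The validation step can be sketched as a plain function over the raw model output. This is an illustrative check, not the project's validator: `validateAnalysis` is a hypothetical name, and it goes slightly beyond the required-keys rule by also checking the enumerated values from the schema.

```javascript
const SIGNALS = ['Interested', 'Hesitant', 'Mixed Signals', 'Low Interest', 'Neutral'];
const RISK_LEVELS = ['Low', 'Medium', 'High'];
const REQUIRED_KEYS = [
  'overallSignal', 'intentSummary', 'toneAnalysis', 'riskLevel',
  'riskExplanation', 'suggestedReplies', 'disclaimer',
];
const REPLY_KEYS = ['safe', 'confident', 'warm'];

// Hypothetical validator: returns a list of problems; empty means valid.
function validateAnalysis(rawJson) {
  let data;
  try {
    data = JSON.parse(rawJson);
  } catch (err) {
    return ['response is not valid JSON'];
  }
  if (data === null || typeof data !== 'object' || Array.isArray(data)) {
    return ['response is not a JSON object'];
  }

  const errors = [];
  for (const key of REQUIRED_KEYS) {
    if (!(key in data)) errors.push(`missing key: ${key}`);
  }
  if ('overallSignal' in data && !SIGNALS.includes(data.overallSignal)) {
    errors.push('overallSignal has an unexpected value');
  }
  if ('riskLevel' in data && !RISK_LEVELS.includes(data.riskLevel)) {
    errors.push('riskLevel has an unexpected value');
  }
  if ('toneAnalysis' in data && !Array.isArray(data.toneAnalysis)) {
    errors.push('toneAnalysis must be an array');
  }
  const replies = data.suggestedReplies;
  if (replies !== undefined) {
    for (const key of REPLY_KEYS) {
      if (!replies || typeof replies[key] !== 'string') {
        errors.push(`suggestedReplies.${key} must be a string`);
      }
    }
  }
  return errors;
}
```

Returning a list of problems instead of a boolean makes it easier to log why a response was rejected.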

OCR Service

// ocr.service.js
const { createWorker } = require('tesseract.js');

async function extractText(imagePath) {
  const worker = await createWorker('eng');
  try {
    const result = await worker.recognize(imagePath);
    return {
      text: (result.data?.text || '').trim(),
      confidence: result.data?.confidence,
    };
  } finally {
    await worker.terminate();
  }
}
  • Multer handles the multipart upload; the temporary file path is passed to Tesseract.
  • If no text is found, the API returns a friendly error: “No text found in image. Try a clearer screenshot.”
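
The empty-result case can be handled by a small mapping from the OCR result to an HTTP response. A sketch under stated assumptions: `buildOcrResponse` is a hypothetical helper, and the 422 status code is a guess, since the article only specifies the error message.

```javascript
const NO_TEXT_MESSAGE = 'No text found in image. Try a clearer screenshot.';

// Hypothetical helper: map the { text, confidence } result of extractText
// to an HTTP status and JSON body. The 422 status is an assumption.
function buildOcrResponse(ocrResult) {
  const text = ((ocrResult && ocrResult.text) || '').trim();
  if (text.length === 0) {
    return { status: 422, body: { error: NO_TEXT_MESSAGE } };
  }
  return {
    status: 200,
    body: { text, confidence: ocrResult.confidence },
  };
}
```

In an Express handler, this could be used as `const { status, body } = buildOcrResponse(await extractText(req.file.path)); res.status(status).json(body);`, where `req.file.path` is the temporary path Multer provides when it stores uploads on disk.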

Cloudflare R2 Storage

// r2.service.js
const { S3Client } = require('@aws-sdk/client-s3');

const client = new S3Client({
  region: 'auto',
  endpoint: `https://${accountId}.r2.cloudflarestorage.com`,
  credentials: { accessKeyId, secretAccessKey },
  forcePathStyle: true,
});
  • Files are stored under the chat-analyzer/ prefix.
  • When R2 environment variables are missing, the service falls back to local disk – handy for local development.
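
The prefix and the local-disk fallback could be expressed as two small helpers. This is a sketch, not the project's code: the function names are hypothetical, `R2_ACCOUNT_ID` is an assumed variable name (the article lists only the two key variables), and the timestamp-plus-suffix key format is an assumption to avoid collisions.

```javascript
const KEY_PREFIX = 'chat-analyzer/';

// Hypothetical helper: choose the storage backend from environment
// variables. R2_ACCOUNT_ID is an assumed name; the two key variables
// are the ones listed under Environment Variables.
function resolveStorageBackend(env) {
  const required = ['R2_ACCOUNT_ID', 'R2_ACCESS_KEY_ID', 'R2_SECRET_ACCESS_KEY'];
  const hasAll = required.every((name) => Boolean(env[name]));
  return hasAll ? 'r2' : 'local';
}

// Build an object key under the chat-analyzer/ prefix. The timestamp and
// random suffix are assumptions to avoid collisions between uploads.
function buildObjectKey(
  originalName,
  now = Date.now(),
  suffix = Math.random().toString(36).slice(2, 8)
) {
  // Keep only a safe subset of characters from the user-supplied name.
  const safeName = String(originalName).replace(/[^a-zA-Z0-9._-]/g, '_');
  return `${KEY_PREFIX}${now}-${suffix}-${safeName}`;
}
```

Calling `resolveStorageBackend(process.env)` once at startup keeps the decision in one place, so the upload code only needs to branch on `'r2'` versus `'local'`.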

UI Details

  • Unified Input – Paste (Ctrl + V) detects images vs. text; the upload button accepts PNG, JPEG, or WebP files.
  • Preview – When an image is pasted or uploaded, a preview appears and the textarea is hidden to reduce clutter.
  • Loading State – During analysis, a semi‑transparent overlay with a spinner covers the input box without causing layout shift.
  • Example Card – A sticky “Example Analysis” card on desktop (positioned below the form on mobile) shows sample output (signal, intent, suggested reply).

Blog Implementation

  • Content

    • Markdown files stored in content/posts/
    • Front‑matter fields: title, description, date, tags, category
  • Rendering

    • Uses react-markdown with remark-gfm for GitHub‑flavored markdown
  • Categories

    • Messaging Psychology
    • WhatsApp Behavior
    • Texting Etiquette
    • Relationship Signals
    • Communication Boundaries
  • SEO

    • Page metadata (title & description) generated for each route
    • SoftwareApplication schema for the analyzer
    • BlogPosting schema for individual articles
  • RSS

    • Generated at build time
    • Includes only title, link, and description to limit full‑content scraping
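
To illustrate the RSS choice, the sketch below renders a feed item from a post's front matter using only the three fields mentioned. It is an assumption-laden example: `renderRssItem`, the `slug` field, and the `/blog/<slug>` URL pattern are hypothetical and not taken from the project.

```javascript
// Escape the five characters that are special in XML text.
function escapeXml(value) {
  return String(value)
    .replace(/&/g, '&amp;')
    .replace(/</g, '&lt;')
    .replace(/>/g, '&gt;')
    .replace(/"/g, '&quot;')
    .replace(/'/g, '&apos;');
}

// Hypothetical helper: render one post as an RSS <item> that carries only
// title, link, and description (no full article body).
function renderRssItem(post, siteUrl) {
  const link = `${siteUrl.replace(/\/$/, '')}/blog/${post.slug}`;
  return [
    '<item>',
    `  <title>${escapeXml(post.title)}</title>`,
    `  <link>${escapeXml(link)}</link>`,
    `  <description>${escapeXml(post.description)}</description>`,
    '</item>',
  ].join('\n');
}
```

Because the post body is never passed to the renderer, the feed cannot leak full article content even if the post object carries it.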

Deployment

  • Frontend – Deployed on Vercel or Cloudflare Pages (static generation for the blog).
  • Backend – The Express API runs on a Node.js host (e.g., Render, Fly.io, or a managed VPS) and connects to the MongoDB instance.

TL;DR

  • Paste or upload → OCR (if needed) → OpenAI analysis → JSON response.
  • No auth, just IP‑based rate limiting (3 analyses per day per IP).
  • Tech: Next.js 14 + Tailwind, Express, MongoDB, OpenAI, Tesseract.js, Cloudflare R2.
  • Blog: Markdown‑driven, static‑site generation, SEO‑friendly.

Project Overview

WatchContact – Paste a conversation or screenshot to obtain:

  • Intent
  • Tone
  • Risk level
  • Suggested replies

No signup required. 3 free analyses per day.


Architecture

| Layer      | Technology / Service                                   |
|------------|----------------------------------------------------------|
| **Frontend** | Next.js                                                |
| **Backend**  | Node.js (any host – Railway, Render, Fly.io, etc.)     |
| **Database** | MongoDB Atlas                                          |
| **Storage**  | Cloudflare R2 (S3‑compatible) for screenshots          |
| **AI**       | OpenAI (structured JSON prompts with validation)      |
| **OCR**      | Tesseract.js (worker init can be slow)                |

Environment Variables

| Variable             | Description                                         |
|----------------------|-----------------------------------------------------|
| OPENAI_API_KEY       | OpenAI authentication token                         |
| MONGODB_URI          | MongoDB Atlas connection string                     |
| R2_ACCESS_KEY_ID     | (optional) Cloudflare R2 access key ID              |
| R2_SECRET_ACCESS_KEY | (optional) Cloudflare R2 secret access key          |
| NEXT_PUBLIC_API_URL  | Base URL the frontend uses to call the backend API  |

Key Considerations

  • IP Rate Limiting – With no authentication, the free tool limits each IP to 3 analyses per day, which curbs abuse while still allowing genuine use.
  • Tesseract.js Warm‑up – The first run is slow due to worker initialization. In production, consider pre‑warming the worker or queuing OCR jobs.
  • Structured JSON Prompts – Validate AI output to ensure reliable, easy‑to‑render data.
  • R2 as S3‑compatible Storage – The AWS SDK works with minimal configuration, keeping storage handling simple.
  • SEO Boost – Combining the analysis tool with a blog provides ongoing value, encouraging repeat visits beyond a one‑off analysis.
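
For the Tesseract.js warm-up point, one way to pre-warm is to create the worker once and reuse it across requests. The sketch below is an assumption, not the project's implementation: `createSharedOcr` is a hypothetical wrapper, and the worker factory is injected (in real use it would be tesseract.js's `createWorker`) so the example stays import-free.

```javascript
// Hypothetical wrapper: create the OCR worker once and reuse it, instead
// of paying the initialization cost on every request. `createWorkerFn`
// stands in for tesseract.js's createWorker.
function createSharedOcr(createWorkerFn, language = 'eng') {
  let workerPromise = null;

  // Start (or reuse) worker initialization; call at boot to pre-warm.
  function warmUp() {
    if (!workerPromise) {
      workerPromise = createWorkerFn(language).catch((err) => {
        workerPromise = null; // allow a retry after a failed start
        throw err;
      });
    }
    return workerPromise;
  }

  // Same return shape as the per-request extractText in the OCR section.
  async function extractText(imagePath) {
    const worker = await warmUp();
    const result = await worker.recognize(imagePath);
    return {
      text: ((result.data && result.data.text) || '').trim(),
      confidence: result.data ? result.data.confidence : undefined,
    };
  }

  return { warmUp, extractText };
}
```

Calling `warmUp()` during server startup moves the slow initialization out of the first user request. A single shared worker still handles one image at a time, so under concurrent load the queuing approach mentioned in the list may be the better fit.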

Tech Stack Summary

| Layer        | Technology                     |
|--------------|--------------------------------|
| Frontend     | Next.js                        |
| Backend      | Express (Node.js)              |
| AI           | OpenAI API (structured JSON)   |
| OCR          | Tesseract.js                   |
| Database     | MongoDB Atlas                  |
| File Storage | Cloudflare R2 (S3‑compatible)  |

All components are interchangeable; the backend can be hosted on any platform that supports Node.js, and the storage layer can be swapped for any S3‑compatible service if desired.
