Building WatchContact: An AI Chat Analyzer with OCR, Rate Limiting, and a Content Blog
Source: Dev.to
What We Built
WatchContact is a two‑part product:
- AI Chat Analyzer – a free tool where users paste a conversation (or upload a screenshot) and receive an AI‑powered analysis that includes intent, tone, risk level, and suggested replies.
- Blog – articles covering messaging psychology, WhatsApp behavior, texting etiquette, and relationship signals.
The goal was to ship something useful quickly: no authentication, no signup—just paste and analyze.
Architecture Diagram
flowchart LR
A["Next.js 14<br/>(Frontend)"] --> B["Express API<br/>(Backend)"]
B --> C["MongoDB<br/>(Analytics)"]
B -->|OpenAI API| D[OpenAI Service]
B -->|"Tesseract.js (OCR)"| E[OCR Service]
B -->|Cloudflare R2| F[Object Storage]
A -->|"Markdown blog (static)"| G[Static Site Generator]

- Next.js 14 – Serves the React‑based frontend and generates the static Markdown blog.
- Express API – Handles application logic, routes requests to external services, and talks to MongoDB.
- MongoDB – Stores analytics and other persistent data.
- OpenAI API – Generates the structured conversation analysis (intent, tone, risk level, suggested replies).
- Tesseract.js (OCR) – Extracts text from uploaded screenshots on the backend.
- Cloudflare R2 – Object storage for uploaded screenshots.
Tech Stack
| Layer | Technologies |
|---|---|
| Frontend | Next.js 14, React, Tailwind CSS |
| Backend | Express, Mongoose, MongoDB |
| AI | OpenAI GPT (conversation analysis) |
| OCR | Tesseract.js |
| Storage | Cloudflare R2 (S3‑compatible) |
| Auth | None – IP‑based rate limiting |
| Deployment | Vercel / Cloudflare Pages (static blog) |
User Flow
- Text input – The user pastes or types a conversation.
  - The request is sent directly to /api/analysis/text.
- Screenshot input – The user pastes an image (Ctrl + V) or uploads a file.
  - The backend runs OCR, optionally stores the image in R2, then analyzes the extracted text.

Both modes share a single input area that automatically detects whether the clipboard contains an image or plain text.
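
The clipboard detection could look roughly like the sketch below. It is an illustration, not the project's actual code: `classifyClipboard` is a hypothetical helper, and it assumes that pasted images are restricted to the same PNG, JPEG, and WebP types the upload button accepts. The `items` argument is expected to have the shape of `event.clipboardData.items` from a browser `paste` event.

```javascript
// Hypothetical helper: decide which input mode a paste should trigger.
// Each item is expected to expose `kind` ("string" or "file") and a MIME `type`,
// like the DataTransferItem objects in event.clipboardData.items.
const ACCEPTED_IMAGE_TYPES = ['image/png', 'image/jpeg', 'image/webp'];

function classifyClipboard(items) {
  const list = Array.from(items);

  // Prefer an image when one is present.
  const image = list.find(
    (item) => item.kind === 'file' && ACCEPTED_IMAGE_TYPES.includes(item.type)
  );
  if (image) return { mode: 'screenshot', type: image.type };

  // Otherwise fall back to plain text.
  const text = list.find(
    (item) => item.kind === 'string' && item.type === 'text/plain'
  );
  if (text) return { mode: 'text' };

  return { mode: 'unsupported' };
}
```

In a React component this might be called from an `onPaste` handler as `classifyClipboard(event.clipboardData.items)`, with the result deciding whether to show the image preview or keep the textarea.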
API Endpoints
| Method | Path | Purpose |
|---|---|---|
| POST | /api/analysis/text | Analyze pasted text |
| POST | /api/analysis/screenshot-extract | Perform OCR and optionally upload to R2 |
| POST | /api/analysis/screenshot-final | Analyze OCR‑extracted text (re‑uses rate limit) |
| GET | /api/analysis/limit-status | Return remaining analyses for the day |
Rate Limiting (No Signup)
We limit each IP to 3 analyses per day.
// rateLimit.service.js – simplified
const DAILY_LIMIT = 3;

async function checkAndIncrement(ipAddress) {
  // Development bypass
  if (isLocalhost(ipAddress)) return { allowed: true };

  // Use the current date as the key (YYYY-MM-DD)
  const dateKey = new Date().toISOString().slice(0, 10);
  let record = await IpUsage.findOne({ ipAddress, dateKey });

  // Create a new record if none exists for today
  if (!record) {
    record = await IpUsage.create({ ipAddress, dateKey, analysisCount: 0 });
  }

  // Deny if the daily limit has been reached
  if (record.analysisCount >= DAILY_LIMIT) {
    return { allowed: false };
  }

  // Increment the count and persist
  record.analysisCount += 1;
  await record.save();
  return { allowed: true };
}

Key Components
- Model: IpUsage { ipAddress, dateKey, analysisCount }
- Frontend: Calls GET /api/analysis/limit-status to display “X of 3 analyses remaining today”.
- Localhost: Exempt from rate limiting for development.
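
As a rough illustration of what the limit-status endpoint might compute, here is a minimal sketch. The helper names (`buildLimitStatus`, `formatRemaining`) and the response fields are assumptions for illustration; the source only specifies the `IpUsage` fields and the message shown to the user.

```javascript
const DAILY_LIMIT = 3;

// Hypothetical helper: build a limit-status payload from today's usage record.
// `record` is today's IpUsage document for the caller's IP, or null if the IP
// has not run any analysis yet today.
function buildLimitStatus(record, limit = DAILY_LIMIT) {
  const used = record ? record.analysisCount : 0;
  const remaining = Math.max(0, limit - used);
  return { limit, used, remaining, allowed: remaining > 0 };
}

// Hypothetical helper: the string the frontend displays.
function formatRemaining(status) {
  return `${status.remaining} of ${status.limit} analyses remaining today`;
}
```

Clamping `remaining` at zero keeps the display sensible even if a stored count ever exceeds the limit.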
Prompt & Response Schema
We send a structured system prompt so the model always returns JSON in a predictable shape.
{
  "overallSignal": "Interested" | "Hesitant" | "Mixed Signals" | "Low Interest" | "Neutral",
  "intentSummary": "Short paragraph...",
  "toneAnalysis": [
    "observation 1",
    "observation 2",
    "..."
  ],
  "riskLevel": "Low" | "Medium" | "High",
  "riskExplanation": "Brief explanation",
  "suggestedReplies": {
    "safe": "A cautious reply",
    "confident": "A more direct reply",
    "warm": "A warm, casual reply"
  },
  "disclaimer": "This is a behavioral interpretation, not certainty."
}

- The server validates the JSON, rejecting responses that are missing any required keys.
- Token usage and estimated cost are stored in MongoDB for internal monitoring only.
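
The validation step could be sketched as below. This is one possible shape, not the project's actual validator: `validateAnalysis` is a hypothetical name, and checking the enum values (rather than only the presence of keys) is an extra assumption on top of what the article describes.

```javascript
const SIGNALS = ['Interested', 'Hesitant', 'Mixed Signals', 'Low Interest', 'Neutral'];
const RISK_LEVELS = ['Low', 'Medium', 'High'];
const REPLY_STYLES = ['safe', 'confident', 'warm'];

const isNonEmptyString = (v) => typeof v === 'string' && v.trim().length > 0;

// Hypothetical validator: returns a list of problems.
// An empty list means the payload matches the schema shown above.
function validateAnalysis(raw) {
  let data;
  try {
    data = typeof raw === 'string' ? JSON.parse(raw) : raw;
  } catch {
    return ['response is not valid JSON'];
  }
  if (data === null || typeof data !== 'object' || Array.isArray(data)) {
    return ['response is not a JSON object'];
  }

  const errors = [];
  if (!SIGNALS.includes(data.overallSignal)) errors.push('overallSignal');
  if (!isNonEmptyString(data.intentSummary)) errors.push('intentSummary');
  if (!Array.isArray(data.toneAnalysis) || !data.toneAnalysis.every(isNonEmptyString)) {
    errors.push('toneAnalysis');
  }
  if (!RISK_LEVELS.includes(data.riskLevel)) errors.push('riskLevel');
  if (!isNonEmptyString(data.riskExplanation)) errors.push('riskExplanation');

  const replies = data.suggestedReplies;
  const repliesOk =
    replies !== null &&
    typeof replies === 'object' &&
    REPLY_STYLES.every((style) => isNonEmptyString(replies[style]));
  if (!repliesOk) errors.push('suggestedReplies');

  if (!isNonEmptyString(data.disclaimer)) errors.push('disclaimer');
  return errors;
}
```

Returning the list of failing keys, instead of a boolean, makes it easier to log why a model response was rejected.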
OCR Service
// ocr.service.js
const { createWorker } = require('tesseract.js');

async function extractText(imagePath) {
  const worker = await createWorker('eng');
  try {
    const result = await worker.recognize(imagePath);
    return {
      text: (result.data?.text || '').trim(),
      confidence: result.data?.confidence,
    };
  } finally {
    await worker.terminate();
  }
}

- Multer handles the multipart upload; the temporary file path is passed to Tesseract.
- If no text is found, the API returns a friendly error: “No text found in image. Try a clearer screenshot.”
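
The empty-result case could be handled with a small mapping step like the one below. This is a sketch under assumptions: `toExtractResponse` is a hypothetical helper, and the 422 status code and the response field names are illustrative choices that the article does not specify.

```javascript
const NO_TEXT_MESSAGE = 'No text found in image. Try a clearer screenshot.';

// Hypothetical helper: turn the result of extractText() into an HTTP-style
// response description ({ status, body }).
function toExtractResponse(ocrResult) {
  const text = (ocrResult?.text || '').trim();
  if (!text) {
    // Assumed status code; the article only specifies the message.
    return { status: 422, body: { error: NO_TEXT_MESSAGE } };
  }
  return {
    status: 200,
    body: { text, confidence: ocrResult.confidence ?? null },
  };
}
```

An Express handler could then end with something like `res.status(result.status).json(result.body)`.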
Cloudflare R2 Storage
// r2.service.js
const { S3Client } = require('@aws-sdk/client-s3');

const client = new S3Client({
  region: 'auto',
  endpoint: `https://${accountId}.r2.cloudflarestorage.com`,
  credentials: { accessKeyId, secretAccessKey },
  forcePathStyle: true,
});

- Files are stored under the chat-analyzer/ prefix.
- When R2 environment variables are missing, the service falls back to local disk, which is handy for local development.
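
The fallback decision and the key prefix might be expressed along these lines. Everything here is a sketch: `resolveStorageMode` and `buildObjectKey` are hypothetical helpers, `R2_ACCOUNT_ID` is an assumed variable name (the article lists only the two key variables), and the date-plus-id key layout is an illustrative choice.

```javascript
const KEY_PREFIX = 'chat-analyzer/';

// Hypothetical helper: use R2 only when all needed variables are present,
// otherwise fall back to local disk.
// R2_ACCOUNT_ID is an assumed name; the article does not list it.
function resolveStorageMode(env) {
  const hasR2 = Boolean(
    env.R2_ACCOUNT_ID && env.R2_ACCESS_KEY_ID && env.R2_SECRET_ACCESS_KEY
  );
  return hasR2 ? 'r2' : 'local';
}

// Hypothetical helper: build an object key under the chat-analyzer/ prefix.
// The "<date>/<id>.<ext>" layout is an assumption for illustration.
function buildObjectKey(
  originalName,
  now = new Date(),
  id = Math.random().toString(36).slice(2, 10)
) {
  const match = /\.(png|jpe?g|webp)$/i.exec(originalName);
  const ext = match ? match[1].toLowerCase() : 'bin';
  const day = now.toISOString().slice(0, 10);
  return `${KEY_PREFIX}${day}/${id}.${ext}`;
}
```

Passing `env`, `now`, and `id` as parameters keeps both helpers easy to test without touching `process.env` or a real bucket.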
UI Details
- Unified Input – Paste (Ctrl + V) detects images vs. text; the upload button accepts PNG, JPEG, or WebP files.
- Preview – When an image is pasted or uploaded, a preview appears and the textarea is hidden to reduce clutter.
- Loading State – During analysis, a semi‑transparent overlay with a spinner covers the input box without causing layout shift.
- Example Card – A sticky “Example Analysis” card on desktop (positioned below the form on mobile) shows sample output (signal, intent, suggested reply).
Blog Implementation
Content
- Markdown files stored in content/posts/
- Front‑matter fields: title, description, date, tags, category
Rendering
- Uses react-markdown with remark-gfm for GitHub‑flavored Markdown
Categories
- Messaging Psychology
- WhatsApp Behavior
- Texting Etiquette
- Relationship Signals
- Communication Boundaries
SEO
- Page metadata (title & description) generated for each route
- SoftwareApplication schema for the analyzer
- BlogPosting schema for individual articles
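
For the article schema, a JSON-LD object could be derived from a post's front matter roughly as follows. This is a sketch, not the site's actual code: `buildBlogPostingJsonLd` and `serializeJsonLd` are hypothetical helpers, and the `slug` field and the `/blog/<slug>` URL pattern are assumptions.

```javascript
// Hypothetical helper: build schema.org BlogPosting data from front matter.
// `post.slug` and the /blog/<slug> URL pattern are assumptions.
function buildBlogPostingJsonLd(post, siteUrl) {
  return {
    '@context': 'https://schema.org',
    '@type': 'BlogPosting',
    headline: post.title,
    description: post.description,
    datePublished: new Date(post.date).toISOString(),
    keywords: (post.tags || []).join(', '),
    articleSection: post.category,
    url: `${siteUrl.replace(/\/$/, '')}/blog/${post.slug}`,
  };
}

// Serialize for embedding in a <script type="application/ld+json"> tag.
// Escaping "<" keeps a stray "</script>" in the content from closing the tag.
function serializeJsonLd(data) {
  return JSON.stringify(data).replace(/</g, '\\u003c');
}
```

The escaped output is still valid JSON, so parsers read back the original strings unchanged.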
RSS
- Generated at build time
- Includes only title, link, and description to limit full‑content scraping
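
A build-time feed limited to those three fields could be generated with something like the sketch below. The helper names and the shape of the `site` and `posts` arguments are assumptions; the article does not show its feed generator.

```javascript
// Escape the five characters that are special in XML text and attributes.
function escapeXml(value) {
  return String(value)
    .replace(/&/g, '&amp;')
    .replace(/</g, '&lt;')
    .replace(/>/g, '&gt;')
    .replace(/"/g, '&quot;')
    .replace(/'/g, '&apos;');
}

// Hypothetical helper: build an RSS 2.0 document whose items carry only
// title, link, and description (no full article body).
function buildRssFeed(site, posts) {
  const items = posts
    .map((post) =>
      [
        '    <item>',
        `      <title>${escapeXml(post.title)}</title>`,
        `      <link>${escapeXml(post.link)}</link>`,
        `      <description>${escapeXml(post.description)}</description>`,
        '    </item>',
      ].join('\n')
    )
    .join('\n');

  return [
    '<?xml version="1.0" encoding="UTF-8"?>',
    '<rss version="2.0">',
    '  <channel>',
    `    <title>${escapeXml(site.title)}</title>`,
    `    <link>${escapeXml(site.url)}</link>`,
    `    <description>${escapeXml(site.description)}</description>`,
    items,
    '  </channel>',
    '</rss>',
  ].join('\n');
}
```

The resulting string can be written to a file such as `public/rss.xml` during the build; that path is only an example.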
Deployment
- Frontend – Deployed on Vercel or Cloudflare Pages (static generation for the blog).
- Backend – Hosted on a Node.js server (e.g., Render, Fly.io, or a managed VPS) with the Express API and MongoDB instance.
TL;DR
- Paste or upload → OCR (if needed) → OpenAI analysis → JSON response.
- No auth, just IP‑based rate limiting (3 requests per day).
- Tech: Next.js 14 + Tailwind, Express, MongoDB, OpenAI, Tesseract.js, Cloudflare R2.
- Blog: Markdown‑driven, static‑site generation, SEO‑friendly.
Project Overview
WatchContact – Paste a conversation or screenshot to obtain:
- Intent
- Tone
- Risk level
- Suggested replies
No signup required. 3 free analyses per day.
## Architecture
| Layer | Technology / Service |
|------------|----------------------------------------------------------|
| **Frontend** | Next.js |
| **Backend** | Node.js (any host – Railway, Render, Fly.io, etc.) |
| **Database** | MongoDB Atlas |
| **Storage** | Cloudflare R2 (S3‑compatible) for screenshots |
| **AI** | OpenAI (structured JSON prompts with validation) |
| **OCR** | Tesseract.js (worker init can be slow) |

Environment Variables
| Variable | Description |
|---|---|
| OPENAI_API_KEY | OpenAI authentication token |
| MONGODB_URI | MongoDB Atlas connection string |
| R2_ACCESS_KEY_ID | (optional) Cloudflare R2 access key ID |
| R2_SECRET_ACCESS_KEY | (optional) Cloudflare R2 secret access key |
| NEXT_PUBLIC_API_URL | Base URL of the back‑end API, used by the front end |
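
A backend could check these variables at startup with a small loader like the one below. It is a sketch under assumptions: `loadBackendConfig` is a hypothetical helper, and treating `OPENAI_API_KEY` and `MONGODB_URI` as required is inferred from the table marking only the R2 variables as optional.

```javascript
// Assumed to be required because the table marks only the R2 keys as optional.
const REQUIRED_VARS = ['OPENAI_API_KEY', 'MONGODB_URI'];

// Hypothetical helper: validate and normalize backend configuration.
// Throws if a required variable is missing; R2 credentials are optional.
function loadBackendConfig(env) {
  const missing = REQUIRED_VARS.filter((name) => !env[name]);
  if (missing.length > 0) {
    throw new Error(`Missing required environment variables: ${missing.join(', ')}`);
  }
  const hasR2Keys = Boolean(env.R2_ACCESS_KEY_ID && env.R2_SECRET_ACCESS_KEY);
  return {
    openaiApiKey: env.OPENAI_API_KEY,
    mongodbUri: env.MONGODB_URI,
    r2: hasR2Keys
      ? { accessKeyId: env.R2_ACCESS_KEY_ID, secretAccessKey: env.R2_SECRET_ACCESS_KEY }
      : null,
  };
}
```

Calling `loadBackendConfig(process.env)` once at startup surfaces a missing key immediately instead of on the first analysis request.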
Key Considerations
- IP Rate Limiting – Free tool without authentication; limit to 3 requests per day per IP to curb abuse while still allowing genuine use.
- Tesseract.js Warm‑up – The first run is slow due to worker initialization. In production, consider pre‑warming the worker or queuing OCR jobs.
- Structured JSON Prompts – Validate AI output to ensure reliable, easy‑to‑render data.
- R2 as S3‑compatible Storage – The AWS SDK works with minimal configuration, keeping storage handling simple.
- SEO Boost – Combining the analysis tool with a blog provides ongoing value, encouraging repeat visits beyond a one‑off analysis.
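
The pre-warming idea from the Tesseract.js bullet could be sketched as a shared, lazily created worker. This is an illustration, not the project's code: `createSharedWorker` is a hypothetical helper, and the worker factory is injected so the sketch does not depend on Tesseract.js itself. With Tesseract.js the factory could be `() => createWorker('eng')`.

```javascript
// Hypothetical helper: create a worker once and reuse it across requests.
// `createWorkerFn` returns a worker or a promise for one.
function createSharedWorker(createWorkerFn) {
  let workerPromise = null;

  return function getWorker() {
    if (!workerPromise) {
      workerPromise = Promise.resolve(createWorkerFn()).catch((err) => {
        workerPromise = null; // allow a retry after a failed start
        throw err;
      });
    }
    return workerPromise;
  };
}
```

Calling `getWorker()` once at server startup moves the initialization cost ahead of the first upload. Unlike the per-request `extractText` shown earlier, a shared worker is not terminated after each call, so shutdown and concurrent OCR jobs would need separate handling.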
Tech Stack Summary
| Layer | Technology |
|---|---|
| Frontend | Next.js |
| Backend | Express (Node.js) |
| AI | OpenAI API (structured JSON) |
| OCR | Tesseract.js |
| Database | MongoDB Atlas |
| File Storage | Cloudflare R2 (S3‑compatible) |
All components are interchangeable; the backend can be hosted on any platform that supports Node.js, and the storage layer can be swapped for any S3‑compatible service if desired.