I Built a Multi-Agent Job Search System with Claude Code — 631 Evaluations, 12 Modes
Source: Dev.to
What I Built
A multi‑agent system with 12 operational modes, each a Claude Code skill file with its own context and rules. Not a single script – an agent that reasons about the problem domain.
Key architectural choice: modes over one long prompt.
career-ops/
├── modes/
│ ├── _shared.md # North Star archetypes, proof points
│ ├── auto-pipeline.md # Full pipeline: JD → eval → PDF → tracker
│ ├── oferta.md # Single‑offer evaluation (A‑F)
│ ├── batch.md # Parallel processing with workers
│ ├── pdf.md # ATS‑optimized CV per offer
│ ├── scan.md # Portal discovery
│ ├── apply.md # Playwright form‑filling
│ └── ... (12 total)
├── reports/ # 631 evaluation files
├── output/ # Generated PDFs
├── applications.md # Central tracker
└── scan-history.tsv # 680 deduplicated URLs
Why modes? Each mode loads only the context it needs. auto‑pipeline skips contact rules, apply skips scoring logic. Less context → better decisions from the LLM.
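To make the "only the context it needs" idea concrete, here is a minimal Python sketch. Claude Code does the real loading of skill files, so this is only an illustration. The `build_context` helper is hypothetical, and it assumes that `_shared.md` is prepended to every mode.

```python
from pathlib import Path


def build_context(modes_dir: Path, mode: str) -> str:
    """Return the shared file plus exactly one mode file (hypothetical helper)."""
    mode_file = modes_dir / f"{mode}.md"
    if not mode_file.is_file():
        raise FileNotFoundError(f"unknown mode: {mode}")

    parts = []
    shared = modes_dir / "_shared.md"
    # Assumption: _shared.md is common to all modes and goes first.
    if shared.is_file():
        parts.append(shared.read_text(encoding="utf-8"))
    parts.append(mode_file.read_text(encoding="utf-8"))
    # No other mode file is read, so the prompt stays small.
    return "\n\n".join(parts)
```

Calling `build_context(Path("modes"), "oferta")` would return the shared archetypes plus the single-offer rules, and nothing from `apply.md` or `batch.md`.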
The 10‑Dimension Scoring
Every offer runs through a weighted evaluation framework.
| Dimension | What It Measures | Weight |
|---|---|---|
| Role Match | Alignment with CV proof points | Gate‑pass |
| Skills Alignment | Tech stack overlap | Gate‑pass |
| Seniority | Stretch level | High |
| Compensation | Market rate vs target | High |
| Geographic | Remote/hybrid feasibility | Medium |
| Company Stage | Startup/growth/enterprise fit | Medium |
| Product‑Market Fit | Problem domain resonance | Medium |
| Growth Trajectory | Career ladder visibility | Medium |
| Interview Likelihood | Callback probability | High |
| Timeline | Hiring urgency | Low |
Role Match and Skills Alignment are gate‑pass dimensions – if they fail, the final score drops regardless of the other scores. 74% of evaluated offers scored below 4.0.
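One way to read the table is as a weighted average with a cap applied when a gate fails. The sketch below shows that reading in Python. The article does not state the numeric weights (High = 3, Medium = 2, Low = 1), the 1 to 5 scale, the gate threshold, or the cap value, so all of these are assumptions. Excluding the two gate dimensions from the average is also an assumption.

```python
# Assumed numeric values for the qualitative weights in the table.
WEIGHTS = {"High": 3.0, "Medium": 2.0, "Low": 1.0}

# Non-gate dimensions and their weight class, taken from the table.
DIMENSIONS = {
    "seniority": "High",
    "compensation": "High",
    "geographic": "Medium",
    "company_stage": "Medium",
    "product_market_fit": "Medium",
    "growth_trajectory": "Medium",
    "interview_likelihood": "High",
    "timeline": "Low",
}

GATES = ("role_match", "skills_alignment")
GATE_THRESHOLD = 3.0  # assumption: a gate "fails" below this score
GATE_CAP = 2.5        # assumption: ceiling applied when any gate fails


def final_score(scores: dict[str, float]) -> float:
    """Weighted mean of the non-gate dimensions, capped if a gate fails."""
    total = sum(WEIGHTS[w] * scores[d] for d, w in DIMENSIONS.items())
    weight_sum = sum(WEIGHTS[w] for w in DIMENSIONS.values())
    weighted = total / weight_sum
    if any(scores[g] < GATE_THRESHOLD for g in GATES):
        weighted = min(weighted, GATE_CAP)
    return round(weighted, 2)
```

With these assumed numbers, an offer that scores 5 everywhere except a Role Match of 2 ends at 2.5, which is the "drops regardless of the other scores" behaviour.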
The Pipeline
auto‑pipeline is the flagship mode. A URL goes in and passes through six steps:
- Extract JD – Playwright navigates to the URL, extracts structured content.
- Evaluate 10D – Claude reads JD + CV + portfolio, generates scoring.
- Generate report – Markdown with six blocks: summary, CV match, level, compensation, personalization, interview probability.
- Generate PDF – HTML template + keyword injection + Puppeteer render.
- Register tracker – TSV auto‑merge via a Node.js script.
- Dedup – Checks 680 URLs in scan-history.tsv. Zero re‑evaluations.
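The dedup step can be sketched as a normalized-URL lookup against the history file. This is an illustrative Python version, not the author's script. It assumes that the URL is the first column of `scan-history.tsv`, and the normalization rules (lowercased host, `utm_*` parameters and fragments stripped) are also assumptions.

```python
import csv
import io
from urllib.parse import parse_qsl, urlencode, urlsplit, urlunsplit


def normalize_url(url: str) -> str:
    """Canonical form: lowercase host, no tracking params, no fragment."""
    parts = urlsplit(url.strip())
    query = [
        (k, v)
        for k, v in parse_qsl(parts.query, keep_blank_values=True)
        if not k.lower().startswith("utm_")  # keep real ids such as ?id=7
    ]
    return urlunsplit((
        parts.scheme.lower(),
        parts.netloc.lower(),
        parts.path.rstrip("/"),
        urlencode(sorted(query)),
        "",  # drop the fragment
    ))


def load_seen(tsv_text: str) -> set[str]:
    """Read URLs from the first TSV column (assumed layout), skipping headers."""
    reader = csv.reader(io.StringIO(tsv_text), delimiter="\t")
    return {normalize_url(r[0]) for r in reader if r and r[0].startswith("http")}


def is_new(url: str, seen: set[str]) -> bool:
    """True the first time a URL is seen; records it so repeats return False."""
    key = normalize_url(url)
    if key in seen:
        return False
    seen.add(key)
    return True
```

Only tracking parameters are removed, because some portals carry the job id in the query string and stripping the whole query would merge distinct postings.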
Batch Processing
For high volume, batch mode launches a conductor that orchestrates parallel workers.
# conductor spawns N workers, each an independent Claude Code process
./batch-runner.sh --input batch/batch-input.tsv --workers 4
# Each worker:
# 1. Claims a URL from the queue (lock file prevents doubles)
# 2. Runs auto-pipeline
# 3. Writes result to batch-state.tsv
# 4. Picks next URL
- 122 URLs processed in parallel.
- Fault‑tolerant: a worker failure never blocks the rest.
- Resumable: reads state and skips completed items.
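The claim, run and record loop of a single worker can be sketched as follows. This Python illustration is not the contents of `batch-runner.sh`. It assumes one lock file per URL, named by a SHA‑1 hash, and a two-column `url<TAB>status` layout for `batch-state.tsv`. The `process` callback is a hypothetical stand-in for running auto-pipeline.

```python
import hashlib
import os
from pathlib import Path
from typing import Callable, Iterable


def try_claim(lock_dir: Path, url: str) -> bool:
    """Atomically create a lock file; only one worker can succeed per URL."""
    lock_dir.mkdir(parents=True, exist_ok=True)
    name = hashlib.sha1(url.encode("utf-8")).hexdigest() + ".lock"
    try:
        # O_EXCL makes creation fail if another worker already holds the lock.
        fd = os.open(lock_dir / name, os.O_CREAT | os.O_EXCL | os.O_WRONLY)
    except FileExistsError:
        return False
    os.close(fd)
    return True


def completed_urls(state_file: Path) -> set[str]:
    """URLs already marked done, so a resumed run can skip them."""
    if not state_file.exists():
        return set()
    rows = (line.split("\t") for line in state_file.read_text(encoding="utf-8").splitlines())
    return {r[0] for r in rows if len(r) >= 2 and r[1] == "done"}


def run_worker(urls: Iterable[str], lock_dir: Path, state_file: Path,
               process: Callable[[str], None]) -> list[str]:
    """Process every URL this worker manages to claim; never raise on failure."""
    done = completed_urls(state_file)
    handled = []
    for url in urls:
        if url in done or not try_claim(lock_dir, url):
            continue
        try:
            process(url)
            status = "done"
        except Exception:
            status = "failed"  # a failed item does not stop the loop
        with state_file.open("a", encoding="utf-8") as fh:
            fh.write(f"{url}\t{status}\n")
        handled.append(url)
    return handled
```

In this sketch a crashed or failed item keeps its lock file, so stale locks would need to be cleared before a retry. The article does not say how the real system handles that case.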
The AI Resume Builder
A generic PDF loses. Career‑Ops generates a different ATS‑optimized CV for each offer:
- Extract 15‑20 keywords from the JD.
- Detect language (e.g., English JD → English CV).
- Detect region (US → Letter, Europe → A4).
- Detect archetype (6 predefined: AI Platform, Agentic, PM, SA, FDE, Transformation).
- Select top 3‑4 projects by relevance.
- Reorder bullets – most relevant experience moves up.
- Render PDF – Puppeteer, self‑hosted fonts, single‑column ATS‑safe.
Same CV, six different framings. All real – keywords are reformulated, never fabricated.
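Two of the steps above, region detection and project selection, are simple enough to sketch in Python. This is a rough illustration, not the author's implementation. The `Project` type, the keyword-overlap ranking, and the A4 default for every non-US region are assumptions. Keywords are treated as single lowercase tokens.

```python
import re
from dataclasses import dataclass


@dataclass
class Project:
    name: str
    description: str


def tokenize(text: str) -> set[str]:
    """Lowercase word tokens; keeps characters used in names like c++ or c#."""
    return set(re.findall(r"[a-z0-9+#]+", text.lower()))


def paper_format(region: str) -> str:
    """US maps to Letter as in the article; A4 for anything else is an assumption."""
    return "Letter" if region.upper() == "US" else "A4"


def select_projects(projects: list[Project], jd_keywords: list[str],
                    limit: int = 4) -> list[Project]:
    """Rank projects by how many JD keywords their description mentions."""
    keywords = {k.lower() for k in jd_keywords}
    scored = [(len(tokenize(p.description) & keywords), p) for p in projects]
    # sorted() is stable, so ties keep the original CV order.
    ranked = sorted(scored, key=lambda pair: pair[0], reverse=True)
    return [p for score, p in ranked if score > 0][:limit]
```

The ranking only reorders and filters existing projects. It never edits their text, which matches the "reformulated, never fabricated" rule.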
Results
Two months in production (real numbers, not demos):
- 631 reports generated
- 68 applications sent
- 354 PDFs generated
- 680 URLs deduplicated
- 0 re‑evaluations
What I Learned
- Automate analysis, not decisions. Career‑Ops evaluates 631 offers; I decide which ones get my time. Human‑in‑the‑loop is a design feature, not a limitation.
- Modes beat a long prompt. Twelve focused modes outperform a single 10k‑token system prompt. My early attempt with one massive prompt produced terrible results.
- Deduplication is more valuable than scoring. 680 deduplicated URLs saved 680 unnecessary evaluations – boring infrastructure with the highest ROI.
- A CV is an argument, not a document. Tailoring proof points and framing to the archetype converts far better than a one‑size‑fits‑all PDF.
- The system is the portfolio. Building a multi‑agent job‑search system is a direct proof of competence for multi‑agent roles.
Stack
- Claude Code – LLM agent: reasoning, evaluation, content generation
- Playwright – Browser automation: portal scanning and form‑filling
- Puppeteer – PDF rendering from HTML templates
- Node.js – Utility scripts: merge‑tracker, cv‑sync‑check
- tmux – Parallel sessions: conductor + workers in batch