I Built an Open-Source CLI That Diagnoses Production Incidents in 30 Seconds — Looking for Contributors
Source: Dev.to
Every engineer who’s been on‑call knows the drill.
It’s 3 AM. PagerDuty goes off. You open your laptop, squint at CloudWatch, start grepping through thousands of log lines, flip over to GitHub to check if anyone deployed recently, then paste everything into an AI chat hoping it can make sense of the mess.
45 minutes later, you find it: someone changed the Redis connection pool from 50 to 5.
I got tired of doing this manually, so I built AUTOPSY — an open‑source Python CLI that does the entire investigation in one command.
pip install autopsy-cli
autopsy diagnose
It pulls the last 30 minutes of error logs from AWS CloudWatch, fetches recent commits and diffs from GitHub, sends everything to an LLM (Claude or GPT‑4o), and prints a structured root‑cause analysis directly in your terminal.
The whole thing runs locally. No agents, no platform, no servers. Your logs go from AWS directly to the AI provider using your credentials—nothing touches our infrastructure.
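To make the "last 30 minutes of error logs" step concrete, here is a minimal sketch of how the arguments for a CloudWatch Logs Insights query could be built. The query string, the function name `build_start_query_args`, and the log group name are assumptions for illustration; the post does not show AUTOPSY's actual query.

```python
from datetime import datetime, timedelta, timezone

# Hypothetical Logs Insights query; AUTOPSY's real filter is not shown in the post.
ERROR_QUERY = (
    "fields @timestamp, @message "
    "| filter @message like /(?i)(error|exception|fatal)/ "
    "| sort @timestamp desc "
    "| limit 500"
)


def build_start_query_args(log_group: str, now: datetime, minutes: int = 30) -> dict:
    """Return keyword arguments shaped for boto3's logs.start_query()."""
    start = now - timedelta(minutes=minutes)
    return {
        "logGroupName": log_group,
        "startTime": int(start.timestamp()),  # epoch seconds
        "endTime": int(now.timestamp()),      # epoch seconds
        "queryString": ERROR_QUERY,
    }


# Example with a made-up log group and a fixed "now" for reproducibility.
args = build_start_query_args(
    "/aws/lambda/checkout",
    datetime(2024, 1, 1, 3, 0, tzinfo=timezone.utc),
)

# With real credentials, these arguments would be passed along the lines of:
#   client = boto3.client("logs")
#   query_id = client.start_query(**args)["queryId"]
#   results = client.get_query_results(queryId=query_id)  # poll until complete
```

Keeping the argument builder separate from the AWS call makes the time-window logic easy to test without credentials.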
Architecture
CLI (Click)
└── DiagnosisOrchestrator
    ├── CloudWatchCollector → AWS Logs Insights (boto3)
    ├── GitHubCollector → Commits + diffs (PyGitHub)
    └── AIEngine → Anthropic / OpenAI
└── TerminalRenderer (Rich)
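The tree above could be wired together roughly as follows. This is a sketch under assumptions, not AUTOPSY's actual code: the `Evidence` container, the `collect()` method name, the `StaticCollector` stand-in, and the `fake_engine` function are all hypothetical, and the real collectors call AWS and GitHub rather than returning canned data.

```python
from abc import ABC, abstractmethod
from dataclasses import dataclass, field
from typing import Callable


@dataclass
class Evidence:
    """Hypothetical common shape for whatever a collector returns."""
    source: str
    items: list[str] = field(default_factory=list)


class BaseCollector(ABC):
    """Shared interface for data sources (method name is an assumption)."""

    @abstractmethod
    def collect(self) -> Evidence:
        """Fetch data from one source and return it in a common shape."""


class StaticCollector(BaseCollector):
    """Stand-in for CloudWatchCollector / GitHubCollector; returns canned data."""

    def __init__(self, source: str, items: list[str]) -> None:
        self.source = source
        self.items = items

    def collect(self) -> Evidence:
        return Evidence(source=self.source, items=list(self.items))


class DiagnosisOrchestrator:
    """Runs every collector, then hands the combined evidence to the AI engine."""

    def __init__(
        self,
        collectors: list[BaseCollector],
        engine: Callable[[list[Evidence]], str],
    ) -> None:
        self.collectors = collectors
        self.engine = engine

    def run(self) -> str:
        evidence = [collector.collect() for collector in self.collectors]
        return self.engine(evidence)


def fake_engine(evidence: list[Evidence]) -> str:
    """Placeholder for the LLM call; just summarizes what was collected."""
    return "; ".join(f"{e.source}: {len(e.items)} item(s)" for e in evidence)


report = DiagnosisOrchestrator(
    [
        StaticCollector("cloudwatch", ["ERROR timeout"]),
        StaticCollector("github", ["abc123 shrink pool", "def456 bump deps"]),
    ],
    fake_engine,
).run()
```

With this shape, supporting another data source means writing one more `BaseCollector` subclass and adding it to the list.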
A four‑stage log‑reduction pipeline compresses raw CloudWatch output to fit LLM context windows:
- Regex filtering
- SHA256 deduplication
- Truncation
- Hard 6,000‑token budget
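The four stages above can be sketched in a few lines. This is an illustration only: the regex, the 200-character truncation limit, and the 4-characters-per-token estimate are assumptions, and only the 6,000-token budget comes from the post.

```python
import hashlib
import re

# Hypothetical pattern and limits; only the token budget is from the post.
ERROR_PATTERN = re.compile(r"error|exception|fatal|timeout", re.IGNORECASE)
MAX_LINE_CHARS = 200
TOKEN_BUDGET = 6000


def estimate_tokens(text: str) -> int:
    """Rough heuristic (about 4 characters per token); not a real tokenizer."""
    return max(1, len(text) // 4)


def reduce_logs(lines: list[str], budget: int = TOKEN_BUDGET) -> list[str]:
    seen: set[str] = set()
    kept: list[str] = []
    used = 0
    for line in lines:
        # Stage 1: regex filtering keeps only lines that look like errors.
        if not ERROR_PATTERN.search(line):
            continue
        # Stage 2: SHA256 deduplication drops exact repeats.
        digest = hashlib.sha256(line.encode("utf-8")).hexdigest()
        if digest in seen:
            continue
        seen.add(digest)
        # Stage 3: truncation caps the length of any single line.
        line = line[:MAX_LINE_CHARS]
        # Stage 4: hard token budget stops once the next line would not fit.
        cost = estimate_tokens(line)
        if used + cost > budget:
            break
        kept.append(line)
        used += cost
    return kept


sample = [
    "INFO request ok",
    "ERROR redis pool exhausted",
    "ERROR redis pool exhausted",
    "FATAL " + "x" * 500,
]
reduced = reduce_logs(sample)
```

Note that hashing whole lines only catches exact duplicates. Real log lines usually carry timestamps or request IDs, so a production version would likely normalize those away before hashing.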
The AI response is validated against a Pydantic schema, with automatic retry on malformed output. Every collector implements a BaseCollector interface, so adding new data sources (Datadog, ELK, GCP) is a single new class.
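The validate-and-retry idea looks roughly like this. To keep the sketch dependency-free, it uses a plain dataclass with hand-written checks as a stand-in for the Pydantic model. The `Diagnosis` fields, the `diagnose_with_retry` name, and the retry prompt wording are all hypothetical; AUTOPSY's real schema is not shown in the post.

```python
import json
from dataclasses import dataclass
from typing import Callable


@dataclass
class Diagnosis:
    """Hypothetical result shape; a stdlib stand-in for a Pydantic model."""
    root_cause: str
    confidence: float

    @classmethod
    def parse(cls, raw: str) -> "Diagnosis":
        data = json.loads(raw)  # JSONDecodeError is a subclass of ValueError
        if not isinstance(data, dict):
            raise ValueError("expected a JSON object")
        root_cause = data.get("root_cause")
        confidence = data.get("confidence")
        if not isinstance(root_cause, str) or not root_cause:
            raise ValueError("root_cause must be a non-empty string")
        if isinstance(confidence, bool) or not isinstance(confidence, (int, float)):
            raise ValueError("confidence must be a number")
        if not 0.0 <= confidence <= 1.0:
            raise ValueError("confidence must be between 0 and 1")
        return cls(root_cause=root_cause, confidence=float(confidence))


def diagnose_with_retry(
    call_model: Callable[[str], str],
    prompt: str,
    max_attempts: int = 3,
) -> Diagnosis:
    """Call the model until its reply validates, up to max_attempts times."""
    last_error = None
    for _ in range(max_attempts):
        raw = call_model(prompt)
        try:
            return Diagnosis.parse(raw)
        except ValueError as exc:
            last_error = exc
            # Feed the validation error back so the model can correct itself.
            prompt = f"{prompt}\n\nYour previous reply was invalid ({exc}). Reply with JSON only."
    raise RuntimeError(f"no valid diagnosis after {max_attempts} attempts") from last_error


# Fake model for demonstration: first reply is malformed, second is valid.
replies = iter([
    "not json",
    '{"root_cause": "Redis pool reduced from 50 to 5", "confidence": 0.9}',
])
calls: list[str] = []


def fake_model(prompt: str) -> str:
    calls.append(prompt)
    return next(replies)


result = diagnose_with_retry(fake_model, "Diagnose this incident.")
```

With Pydantic v2, the hand-written `parse` method would typically be replaced by a `BaseModel` subclass and a call to `model_validate_json`, catching `ValidationError` instead of `ValueError`.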
Tech Stack
- Python 3.10–3.13
- Click, boto3, PyGitHub, Rich, questionary
- Pydantic v2
- Anthropic + OpenAI SDKs
AUTOPSY is live on PyPI with 149 passing tests and full CI/CD (GitHub Actions → PyPI via OIDC).
Open Issues & Contribution Opportunities
I’ve created 17 open issues across three difficulty levels. They are scoped, well‑documented, and perfect for a first open‑source PR.
| Issue | Description |
|---|---|
| --version output | Show Python version, OS, prompt version |
| Configurable log severity filter | Let users control which log levels get pulled |
| CONTRIBUTING.md | Help future contributors get started |
| PR template | Standardize pull requests |
| Improved AWS credential error messages | Better error UX |
| Datadog Logs collector | For teams not on CloudWatch |
| GitLab collector | For non‑GitHub users |
| Demo mode | Show AUTOPSY without any credentials |
| Diagnosis history (SQLite) | Persist past diagnoses locally |
| Slack notification | Post results to an incident channel |
| Parallel collector execution | Speed up multi‑log‑group queries with asyncio |
| ELK / OpenSearch collector | |
| Ollama support | Fully local LLM for teams that can’t send logs to cloud providers |
| Prompt evaluation harness | Automated accuracy testing against known incidents |
| GCP Cloud Logging collector | |
| Auto‑generated post‑mortem documents | |
Each issue includes clear acceptance criteria, implementation hints, and links to the relevant source files.
Development Setup
# Clone and install in dev mode
git clone https://github.com/zaappy/autopsy.git
cd autopsy
pip install -e ".[dev]"
# Run tests
pytest
# Run linting
ruff check .
The codebase is clean, strictly linted (ruff, 7 rule sets), type‑checked (mypy strict mode), and every module has test coverage.
Why This Matters
AUTOPSY targets the one phase of the incident lifecycle that nobody owns: diagnosis.
- Detection → solved (Datadog, PagerDuty, Grafana)
- Response coordination → solved (Rootly, incident.io)
- Diagnosis → still manual grep and intuition at most companies
Funded players in this space (Ciroos at $21M, incident.io at $28M+) are building expensive enterprise platforms. Nobody is building a simple, free tool that an individual engineer can install in 30 seconds. That’s the gap.
The CLI will always be free and open‑source. A paid team layer (AUTOPSY Cloud) is on the roadmap for teams that need persistent history, shared dashboards, and Slack integration.
Links
- GitHub:
- PyPI:
- Issues:
If you’ve ever been woken up at 3 AM by a production incident, you know the pain this solves. Come build it with me.
⭐ Star the repo if this resonates, and grab an issue if you want to contribute. Every PR gets a proper review and every contributor gets credited.
Built by Zeel with help from Claude Code.