I Built an Open-Source CLI That Diagnoses Production Incidents in 30 Seconds — Looking for Contributors

Published: March 9, 2026 at 04:58 PM EDT

Source: Dev.to

Every engineer who’s been on call knows the drill.
It’s 3 AM. PagerDuty goes off. You open your laptop, squint at CloudWatch, start grepping through thousands of log lines, flip over to GitHub to check if anyone deployed recently, then paste everything into an AI chat hoping it can make sense of the mess.

45 minutes later, you find it: someone changed the Redis connection pool from 50 to 5.

I got tired of doing this manually, so I built AUTOPSY — an open‑source Python CLI that does the entire investigation in one command.

pip install autopsy-cli
autopsy diagnose

It pulls the last 30 minutes of error logs from AWS CloudWatch, fetches recent commits and diffs from GitHub, sends everything to an LLM (Claude or GPT‑4o), and prints a structured root‑cause analysis directly in your terminal.

The whole thing runs locally. No agents, no platform, no servers. Your logs go from AWS directly to the AI provider using your credentials—nothing touches our infrastructure.

Architecture

CLI (Click)
 └── DiagnosisOrchestrator
       ├── CloudWatchCollector  →  AWS Logs Insights (boto3)
       ├── GitHubCollector      →  Commits + diffs (PyGitHub)
       └── AIEngine             →  Anthropic / OpenAI
               └── TerminalRenderer (Rich)

A four‑stage log‑reduction pipeline compresses raw CloudWatch output to fit LLM context windows:

1. Regex filtering
2. SHA256 deduplication
3. Truncation
4. Hard 6,000-token budget
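
To make the four stages concrete, here is a minimal sketch of how such a reduction pipeline could look. It is not AUTOPSY's actual code: the regex, the per-line character cap, the four-characters-per-token estimate, and the names `reduce_logs` and `estimate_tokens` are all assumptions chosen for illustration.

```python
import hashlib
import re

# Assumed values for illustration; the real pattern and limits may differ.
ERROR_PATTERN = re.compile(r"ERROR|CRITICAL|Exception|Traceback")
MAX_LINE_CHARS = 500
TOKEN_BUDGET = 6000


def estimate_tokens(text: str) -> int:
    """Rough heuristic (an assumption): about four characters per token."""
    return max(1, len(text) // 4)


def reduce_logs(lines, budget=TOKEN_BUDGET):
    """Shrink raw log lines so they fit within a fixed token budget."""
    # Stage 1: regex filtering keeps only lines that look like errors.
    matched = [line for line in lines if ERROR_PATTERN.search(line)]

    # Stage 2: SHA-256 deduplication drops exact repeats, keeping the first.
    seen = set()
    unique = []
    for line in matched:
        digest = hashlib.sha256(line.encode("utf-8")).hexdigest()
        if digest not in seen:
            seen.add(digest)
            unique.append(line)

    # Stage 3: truncation caps the length of each remaining line.
    truncated = [line[:MAX_LINE_CHARS] for line in unique]

    # Stage 4: hard token budget; stop before the total would exceed it.
    kept = []
    used = 0
    for line in truncated:
        cost = estimate_tokens(line)
        if used + cost > budget:
            break
        kept.append(line)
        used += cost
    return kept
```

Running the cheap stages first (filtering and deduplication) means the budget check only has to deal with lines that are already worth sending to the model.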

The AI response is validated against a Pydantic schema, with automatic retry on malformed output. Every collector implements a BaseCollector interface, so adding new data sources (Datadog, ELK, GCP) is a single new class.
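
As a rough idea of what "a single new class" might involve, the sketch below shows one possible shape for a collector interface. Only the name `BaseCollector` comes from the project; the `collect` method, its `window_minutes` parameter, the `CollectedData` container, `StaticCollector`, and `run_collectors` are hypothetical and may not match the real codebase.

```python
from abc import ABC, abstractmethod
from dataclasses import dataclass, field


@dataclass
class CollectedData:
    """Hypothetical container for whatever a collector returns."""
    source: str
    entries: list = field(default_factory=list)


class BaseCollector(ABC):
    """Assumed shape of the interface; the real one may differ."""

    name: str = "base"

    @abstractmethod
    def collect(self, window_minutes: int) -> CollectedData:
        """Return data gathered for the last `window_minutes` minutes."""


class StaticCollector(BaseCollector):
    """Toy collector that returns canned lines instead of calling an API."""

    name = "static"

    def __init__(self, lines):
        self._lines = list(lines)

    def collect(self, window_minutes: int) -> CollectedData:
        return CollectedData(source=self.name, entries=self._lines)


def run_collectors(collectors, window_minutes=30):
    """Call every collector and index the results by collector name."""
    return {c.name: c.collect(window_minutes) for c in collectors}
```

Under these assumptions, a Datadog or GCP collector would subclass `BaseCollector`, implement `collect`, and be passed to the orchestrator alongside the existing ones.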

Tech Stack

Python 3.10–3.13
Click, boto3, PyGitHub, Rich, questionary
Pydantic v2
Anthropic + OpenAI SDKs

AUTOPSY is live on PyPI with 149 passing tests and full CI/CD (GitHub Actions → PyPI via OIDC).

Open Issues & Contribution Opportunities

I’ve created 17 open issues across three difficulty levels. They are scoped, well‑documented, and perfect for a first open‑source PR.

Issue	Description
`--version` output	Show Python version, OS, prompt version
Configurable log severity filter	Let users control which log levels get pulled
`CONTRIBUTING.md`	Help future contributors get started
PR template	Standardize pull requests
Improved AWS credential error messages	Better error UX
Datadog Logs collector	For teams not on CloudWatch
GitLab collector	For non‑GitHub users
Demo mode	Show AUTOPSY without any credentials
Diagnosis history (SQLite)	Persist past diagnoses locally
Slack notification	Post results to an incident channel
Parallel collector execution	Speed up multi‑log‑group queries with asyncio
ELK / OpenSearch collector
Ollama support	Fully local LLM for teams that can’t send logs to cloud providers
Prompt evaluation harness	Automated accuracy testing against known incidents
GCP Cloud Logging collector
Auto‑generated post‑mortem documents

Each issue includes clear acceptance criteria, implementation hints, and links to the relevant source files.

Development Setup

# Clone and install in dev mode
git clone https://github.com/zaappy/autopsy.git
cd autopsy
pip install -e ".[dev]"

# Run tests
pytest

# Run linting
ruff check .

The codebase is clean, strictly linted (ruff, 7 rule sets), type‑checked (mypy strict mode), and every module has test coverage.

Why This Matters

AUTOPSY targets the one phase of the incident lifecycle that nobody owns: diagnosis.

Detection → solved (Datadog, PagerDuty, Grafana)
Response coordination → solved (Rootly, incident.io)
Diagnosis → still manual grep and intuition at most companies

Funded players in this space (Ciroos at $21M, incident.io at $28M+) are building expensive enterprise platforms. Nobody is building a simple, free tool that an individual engineer can install in 30 seconds. That’s the gap.

The CLI will always be free and open‑source. A paid team layer (AUTOPSY Cloud) is on the roadmap for teams that need persistent history, shared dashboards, and Slack integration.
