I built an AI tool for incident investigation (honest feedback wanted)

Published: March 27, 2026, 9:42 PM GMT+9
3 min read
Source: Dev.to

Introduction

Hey everyone 👋

Over the past couple of weeks, I’ve been building a side project called Opsrift.

It started from a simple frustration: postmortems, handovers, and incident documentation take way too much time — and most of it is repetitive. While building it, I realized the real problem isn’t just writing postmortems; it’s understanding what actually happened during an incident.

What Opsrift does right now

The platform focuses on incident workflows, primarily for SRE, support, or operations teams. Current features include:

Postmortem generator

Takes incident data and generates structured postmortems in seconds.

Handover generator

Useful for shift‑based teams — turns messy updates into clean handovers.

Runbook generator

Creates structured runbooks based on incident patterns or inputs.

Incident Investigator (main focus)

  • Pulls data from tools like Jira, PagerDuty, and Opsgenie
  • Correlates it with deployments from GitHub
  • Attempts to reconstruct what actually happened (timeline, possible causes, etc.)

The goal is to reduce the time spent jumping between tools during investigations.
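To make that concrete, here's a minimal sketch of what the first step of an investigation like this might look like: merging events pulled from different tools into one chronological timeline. The `Event` shape and the sample data are illustrative assumptions on my part, not Opsrift's actual schema.

```python
from dataclasses import dataclass
from datetime import datetime, timezone

# Hypothetical normalized event; each integration would map its
# own payload (PagerDuty alert, GitHub deploy, Jira ticket) into this.
@dataclass
class Event:
    source: str        # e.g. "pagerduty", "github", "jira"
    timestamp: datetime
    summary: str

def build_timeline(*event_streams: list[Event]) -> list[Event]:
    """Merge per-tool event lists into one chronological timeline."""
    merged = [e for stream in event_streams for e in stream]
    return sorted(merged, key=lambda e: e.timestamp)

utc = timezone.utc
alerts = [Event("pagerduty", datetime(2026, 3, 27, 12, 5, tzinfo=utc),
                "High error rate on api-gateway")]
deploys = [Event("github", datetime(2026, 3, 27, 11, 58, tzinfo=utc),
                 "Deploy #412 to api-gateway")]

timeline = build_timeline(alerts, deploys)
for e in timeline:
    print(e.timestamp.isoformat(), e.source, e.summary)
```

Once everything sits on one timeline, "the deploy landed seven minutes before the alert" stops being something you piece together across three browser tabs.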

Status page

Basic external communication for incidents.

Integrations

Current integrations (still early; some are rough):

  • Jira
  • PagerDuty
  • Opsgenie
  • GitHub
  • Slack
  • Confluence

What it’s NOT (yet)

  • Not a replacement for your incident‑management tools
  • Not perfect at root‑cause analysis
  • Not “production‑grade” in every edge case

Right now it’s closer to an AI layer on top of your existing tools to speed up investigation and documentation.

Known issues

  • GitHub login ❌ (bugged)
  • Slack login ❌ (bugged)

You can still use:

  • Google login
  • Email/password signup

Fixes are in progress.

What I’m trying to figure out

I’d really appreciate help validating a few things:

  • Does the Incident Investigator actually help, or is it just “nice to have”?
  • Are the outputs accurate enough to be trusted?
  • Would you use something like this in real workflows?
  • What’s missing for it to be genuinely useful?

Where I want to take this

Long‑term ideas include moving beyond generating outputs to:

  • Detecting patterns across incidents
  • Identifying unstable services
  • Highlighting teams with high escalation rates
  • Correlating deployments with incidents automatically

In short: turning incident data into actionable insights.
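For the deployment-incident correlation idea, a naive first cut is a time-window join: flag any deploy that landed shortly before an incident opened. Everything here (the names, the 30-minute window) is a hypothetical sketch of the approach, not how Opsrift works today.

```python
from datetime import datetime, timedelta, timezone

def correlate(deploys, incidents, window=timedelta(minutes=30)):
    """Return (deploy_id, incident_id) pairs where the deploy
    landed within `window` before the incident opened."""
    return [
        (dep_id, inc_id)
        for inc_time, inc_id in incidents
        for dep_time, dep_id in deploys
        if timedelta(0) <= inc_time - dep_time <= window
    ]

utc = timezone.utc
deploys = [(datetime(2026, 3, 27, 11, 58, tzinfo=utc), "deploy-412")]
incidents = [
    (datetime(2026, 3, 27, 12, 5, tzinfo=utc), "INC-101"),
    (datetime(2026, 3, 27, 18, 0, tzinfo=utc), "INC-102"),
]

suspects = correlate(deploys, incidents)
print(suspects)  # deploy-412 is a suspect for INC-101 only
```

A real version would need to be smarter (service ownership, change size, historical false-positive rates), but even this crude join surfaces the "what shipped right before this broke?" question automatically.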

If you want to try it

👉

No pressure — even quick feedback is super helpful.

Final note

I’ve worked in NOC/SOC and incident‑heavy environments, so this is a “scratch my own itch” project. I’m aware tools like this can become:

  • Too generic
  • Inaccurate
  • Just another dashboard nobody uses

I’d rather get honest feedback early, even if it’s:

“this doesn’t solve anything for me”

That’s useful.

Thanks in advance 🙌
