The mental checklist I use when troubleshooting Linux servers

Published: December 20, 2025 at 02:09 PM EST
Source: Dev.to

Step 1: What is broken?

  • Service not running?
  • Server unreachable?
  • Performance issue?
  • Permission issue?
  • Always define failure first

Step 2: Is the system alive?

  • Can I SSH in?
  • Is the server responsive?
  • Is the disk full?
  • Is RAM exhausted?

Step 3: Is the service running?

  • Is the process running?
  • Did it fail to start?
  • Did it crash?
  • In my experience, this step alone resolves about 50% of issues
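
One way to answer these questions in a script, assuming the server uses systemd: ask systemctl show for the unit's ActiveState, SubState, and Result properties and summarize them. The helper names below are hypothetical, and the summary is a rough heuristic rather than a full diagnosis.

```python
import subprocess


def parse_show_output(text):
    """Turn the KEY=VALUE lines printed by `systemctl show` into a dict."""
    props = {}
    for line in text.splitlines():
        key, sep, value = line.partition("=")
        if sep:
            props[key] = value
    return props


def summarize(props):
    """Map systemd properties to a short answer for 'is it running?'."""
    state = props.get("ActiveState", "unknown")
    if state == "active":
        return "running"
    if state == "failed":
        # Result hints at why, e.g. exit-code, signal, or timeout.
        return "failed (Result=%s)" % props.get("Result", "unknown")
    return state


def service_state(unit):
    """Query systemd for one unit; requires systemctl on the PATH."""
    result = subprocess.run(
        ["systemctl", "show", unit, "--property=ActiveState,SubState,Result"],
        capture_output=True,
        text=True,
        check=False,
    )
    return summarize(parse_show_output(result.stdout))
```

For example, service_state("nginx.service") (the unit name is only a placeholder) returns "running", an inactive state, or a "failed (...)" string that tells you whether to look at startup or at a crash.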

Step 4: Check logs

  • Why did it fail?
  • What was it trying to do?
  • What couldn’t it access?
  • Learn to scan logs, not read every line
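
Scanning rather than reading can be approximated with a keyword filter. This is a small sketch; the keyword list is my assumption and should be adjusted to whatever the service in question actually logs.

```python
import re

# Assumed set of "interesting" words; extend it for your own services.
KEYWORDS = re.compile(
    r"error|fail|denied|refused|timed? ?out|no such file", re.IGNORECASE
)


def scan(lines, pattern=KEYWORDS, tail=20):
    """Return the last `tail` lines that match `pattern`, newlines stripped."""
    hits = [line.rstrip("\n") for line in lines if pattern.search(line)]
    return hits[max(0, len(hits) - tail):]
```

Because scan() accepts any iterable of lines, it works on an open log file as well as on sys.stdin when log output is piped into the script.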

Step 5: What changed last?

  • Updates
  • Config edits
  • Permission changes
  • New files
  • Always ask: what changed?
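
For config edits and new files, modification times are a quick first clue. The sketch below walks a directory tree and lists files changed within a time window; the function name and the 24-hour default are assumptions, and it does not detect package updates or permission-only changes, which update ctime rather than mtime.

```python
import os
import time


def recently_modified(root, hours=24, now=None):
    """List files under `root` modified in the last `hours`, newest first."""
    now = time.time() if now is None else now
    cutoff = now - hours * 3600
    changed = []
    for dirpath, _dirnames, filenames in os.walk(root):
        for name in filenames:
            path = os.path.join(dirpath, name)
            try:
                # lstat so a broken symlink does not raise.
                mtime = os.lstat(path).st_mtime
            except OSError:
                continue  # file vanished or is unreadable; skip it
            if mtime >= cutoff:
                changed.append((mtime, path))
    return [path for _mtime, path in sorted(changed, reverse=True)]
```

Running recently_modified("/etc") shortly after an incident often points straight at the config file someone touched.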

Step 6: Narrow scope

  • Is it one user or all users?
  • One service or the whole system?
  • One port or all networking?
  • This prevents panic
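
The "one port or all networking" question can be tested directly. A minimal sketch using plain TCP connects; the helper names and the 2-second timeout are assumptions. Note that it only proves a TCP handshake succeeds, not that the service behind the port is healthy.

```python
import socket


def port_open(host, port, timeout=2.0):
    """Return True if a TCP connection to host:port succeeds within `timeout`."""
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        # Covers refused connections, timeouts, and unreachable hosts.
        return False


def check_ports(host, ports, timeout=2.0):
    """Check several ports on one host and return {port: reachable}."""
    return {port: port_open(host, port, timeout) for port in ports}
```

If check_ports(host, [22, 80, 443]) shows only one port closed, the problem is likely that service or a firewall rule for it; if every port is closed, look at the network or the host itself.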

Step 7: Test ONE thing at a time

  • Make a small change
  • Restart the service
  • Observe
  • Never shotgun‑fix
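
One way to keep yourself honest about changing a single thing: keep a copy of the config before editing and diff it against the edited version before restarting. A sketch using Python's difflib; the ".bak" label and the helper names are assumptions for illustration.

```python
import difflib


def config_diff(before, after, name="config"):
    """Return a unified diff between two versions of a config file's text."""
    return "".join(
        difflib.unified_diff(
            before.splitlines(keepends=True),
            after.splitlines(keepends=True),
            fromfile=name + ".bak",
            tofile=name,
        )
    )


def changed_line_count(diff_text):
    """Count added and removed lines, skipping the two file-header lines."""
    body = diff_text.splitlines()[2:]
    return sum(1 for line in body if line.startswith(("+", "-")))
```

If changed_line_count() reports more than a line or two, the edit is probably too large to tell which part fixed or broke the service.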

Step 8: Confirm + document

  • Is it fixed?
  • Why did the fix work?
  • What would I do faster next time?
  • That’s real troubleshooting
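
Documenting can be as light as appending one structured line per incident to a file. This is only a sketch; the field names and the JSON-lines format are my assumptions, and any notes file or wiki page serves the same purpose.

```python
import json
from datetime import datetime, timezone


def incident_note(problem, cause, fix, faster_next_time, when=None):
    """Build a one-line JSON record answering the three questions above."""
    when = when or datetime.now(timezone.utc)
    return json.dumps(
        {
            "when": when.isoformat(),
            "problem": problem,
            "cause": cause,
            "fix": fix,
            "faster_next_time": faster_next_time,
        }
    )


def append_note(path, note):
    """Append the record to a notes file, one JSON object per line."""
    with open(path, "a", encoding="utf-8") as handle:
        handle.write(note + "\n")
```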