Stop Ignoring Your Snore: Building a Local Sleep Apnea Detector with OpenAI Whisper and Librosa

Published: January 17, 2026 at 07:50 PM EST
Source: Dev.to

Introduction

Sleep is supposed to be the time when our bodies recharge, but for millions it’s a battle for air. Sleep apnea is a silent killer, often going undiagnosed because clinical sleep studies are expensive and intrusive. What if you could use audio‑signal processing and OpenAI Whisper to monitor breathing patterns locally, ensuring total privacy?

Overview

This tutorial demonstrates a hybrid approach to sleep‑apnea detection:

  • Acoustic track – Librosa computes a short‑time Fourier transform (STFT) and per‑frame energy to separate silent stretches from activity in the snoring frequency band.
  • Semantic track – Whisper’s encoder extracts high‑level audio features to distinguish gasps, chokes, and background noise.

The two tracks feed into an event‑detection engine that produces an Apnea‑Hypopnea Index (AHI) report stored locally.

graph TD
    A[Raw Audio Input .wav/.mp3] --> B{FFmpeg Processing}
    B --> C[Segmented Audio Chunks]
    C --> D[Librosa: Spectral Analysis]
    C --> E[Whisper: Feature Extraction]
    D --> F[Frequency & Amplitude Thresholding]
    E --> G[Acoustic Pattern Recognition]
    F --> H[Event Detection Engine]
    G --> H[Event Detection Engine]
    H --> I[Apnea‑Hypopnea Index - AHI Report]
    I --> J[Local Storage / Privacy First]
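
The last stage of the diagram reports an AHI, which is conventionally the number of apnea and hypopnea events per hour of sleep. The sketch below shows that arithmetic. The function names are hypothetical, and the severity bands are the commonly cited adult cutoffs rather than anything taken from this project. An audio-only pipeline cannot measure true sleep time, so passing the recording duration as `sleep_seconds` is an assumption.

```python
def apnea_hypopnea_index(event_count: int, sleep_seconds: float) -> float:
    """Events per hour of (assumed) sleep time."""
    if sleep_seconds <= 0:
        raise ValueError("sleep_seconds must be positive")
    return event_count / (sleep_seconds / 3600.0)


def ahi_severity(ahi: float) -> str:
    """Map an AHI value to commonly cited adult severity bands (not a diagnosis)."""
    if ahi < 5:
        return "normal"
    if ahi < 15:
        return "mild"
    if ahi < 30:
        return "moderate"
    return "severe"


# Example: 42 detected events over a 7-hour recording
ahi = apnea_hypopnea_index(event_count=42, sleep_seconds=7 * 3600)
print(f"AHI = {ahi:.1f} events/hour ({ahi_severity(ahi)})")
```

Events found by an energy threshold alone are only candidates, so an index computed this way is best read as a rough screening number.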

Prerequisites

  • Python 3.9+
  • OpenAI Whisper (audio feature extraction)
  • Librosa (audio analysis)
  • PyTorch (run Whisper models)
  • FFmpeg (audio decoding)
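
pip install openai-whisper librosa torch matplotlib ffmpeg-python

Note that ffmpeg-python is only a Python wrapper. The FFmpeg binary itself must be installed separately and be available on your PATH.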

Energy‑Based Breathing Analysis

First we extract the Short‑Time Fourier Transform (STFT) to examine energy distribution. Snoring typically occupies the 60 Hz – 2000 Hz range, while apnea events appear as sudden energy drops.
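
To make the 60 Hz – 2000 Hz band concrete, here is a minimal sketch that measures how much of each frame's spectral energy falls inside it. It uses plain NumPy FFTs so it runs without an audio file; `librosa.stft` could fill the same role. The function name `band_energy_ratio` and its defaults (`n_fft=2048`, `hop_length=512`) are assumptions for illustration, not part of the original pipeline.

```python
import numpy as np


def band_energy_ratio(y, sr, fmin=60.0, fmax=2000.0, n_fft=2048, hop_length=512):
    """Per-frame fraction of spectral energy that falls inside [fmin, fmax]."""
    y = np.asarray(y, dtype=float)
    if len(y) < n_fft:
        # Zero-pad very short signals so at least one frame exists
        y = np.pad(y, (0, n_fft - len(y)))
    window = np.hanning(n_fft)
    freqs = np.fft.rfftfreq(n_fft, d=1.0 / sr)
    in_band = (freqs >= fmin) & (freqs <= fmax)

    ratios = []
    for start in range(0, len(y) - n_fft + 1, hop_length):
        frame = y[start : start + n_fft] * window
        power = np.abs(np.fft.rfft(frame)) ** 2
        total = power.sum()
        ratios.append(power[in_band].sum() / total if total > 0 else 0.0)
    return np.array(ratios)


# Synthetic check: a 200 Hz tone sits inside the band, a 5 kHz tone does not
sr = 16000
t = np.arange(sr) / sr
low_tone = band_energy_ratio(np.sin(2 * np.pi * 200 * t), sr)
high_tone = band_energy_ratio(np.sin(2 * np.pi * 5000 * t), sr)
print(low_tone.mean(), high_tone.mean())
```

A frame with a high ratio and high overall energy is a snoring candidate. A low ratio suggests noise from outside the band.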

import librosa
import numpy as np

HOP_LENGTH = 512  # Samples between RMS frames (librosa's default)

def analyze_breathing_energy(audio_path, threshold=0.01):
    # Load audio (resampled to 16 kHz for Whisper compatibility)
    y, sr = librosa.load(audio_path, sr=16000)

    # Magnitude spectrogram, available for band-level inspection (unused below)
    stft = np.abs(librosa.stft(y))
    # Frame-wise RMS energy, shape (1, n_frames)
    energy = librosa.feature.rms(y=y, hop_length=HOP_LENGTH)

    # Boolean mask of "silent" frames; adjust threshold to the ambient noise
    silent_frames = energy < threshold
    return y, sr, silent_frames


def detect_apnea_events(audio_path, min_silence_s=10.0):
    y, sr, silent_frames = analyze_breathing_energy(audio_path)
    silent = silent_frames[0]

    # Number of consecutive frames needed to cover min_silence_s
    window = int(np.ceil(min_silence_s * sr / HOP_LENGTH))

    events = []
    # Simplified sliding-window logic: step through non-overlapping windows
    for i in range(0, len(silent) - window + 1, window):
        if np.all(silent[i : i + window]):  # Silent for the whole window
            start_time = librosa.frames_to_time(i, sr=sr, hop_length=HOP_LENGTH)
            events.append(f"Apnea warning at {start_time:.2f} seconds")

    return events

print(detect_apnea_events("sleep_record.wav"))
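
Fixed windows can miss a pause that straddles a window boundary. One possible refinement is to find every continuous run of silent frames and keep the runs that last long enough. The sketch below assumes a boolean per-frame mask (True for silent) and the same 16 kHz sample rate and 512-sample hop used above. `find_silent_runs` is a hypothetical helper name.

```python
import numpy as np


def find_silent_runs(silent, sr=16000, hop_length=512, min_duration_s=10.0):
    """Return (start_s, end_s) for each run of True frames lasting at least min_duration_s."""
    silent = np.asarray(silent, dtype=bool)
    # Pad with False on both sides so every run has a rising and a falling edge
    padded = np.concatenate(([False], silent, [False]))
    edges = np.diff(padded.astype(int))
    starts = np.flatnonzero(edges == 1)   # first frame of each run
    ends = np.flatnonzero(edges == -1)    # one past the last frame of each run

    frame_s = hop_length / sr  # seconds covered by one hop
    runs = []
    for s, e in zip(starts, ends):
        if (e - s) * frame_s >= min_duration_s:
            runs.append((float(s * frame_s), float(e * frame_s)))
    return runs


# Synthetic mask: a 400-frame pause (kept) and a 200-frame pause (too short)
mask = np.concatenate([
    np.zeros(100, dtype=bool), np.ones(400, dtype=bool),
    np.zeros(50, dtype=bool), np.ones(200, dtype=bool),
    np.zeros(10, dtype=bool),
])
for start_s, end_s in find_silent_runs(mask):
    print(f"Silent from {start_s:.2f} s to {end_s:.2f} s")
```

Because each run carries its own start and end time, the event duration can be reported directly.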

Visualizing Breathing Patterns

Using Matplotlib (and Librosa’s display utilities) we can plot amplitude over time to spot flatlines and compensatory spikes.

import matplotlib.pyplot as plt
import librosa.display

def plot_breathing(y, sr):
    plt.figure(figsize=(12, 4))
    librosa.display.waveshow(y, sr=sr, alpha=0.5)
    plt.title("Nocturnal Breathing Pattern")
    plt.xlabel("Time (s)")
    plt.ylabel("Amplitude")
    plt.show()

Next Steps

  • Train a snore classifier (e.g., Random Forest) on top of Whisper embeddings.
  • Integrate the pipeline with a mobile app (Flutter, React Native) for real‑time bedside alerts.
  • Explore high‑concurrency audio streaming and medical‑LLM summarization as described in the WellAlly Tech Blog.
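
As a starting point for the first item, the sketch below shows one way to turn a clip into a fixed-size vector using Whisper's encoder. It is a minimal sketch, not the article's implementation: the helper names `mean_pool` and `whisper_embedding`, the choice of the "tiny" model, and mean pooling over time are all assumptions. Whisper pads or trims input to 30 seconds, so a full night would first need to be split into chunks.

```python
import numpy as np


def mean_pool(frames):
    """Average encoder frames of shape (n_frames, n_features) over time."""
    frames = np.asarray(frames, dtype=np.float32)
    return frames.mean(axis=0)


def whisper_embedding(audio_path, model_name="tiny"):
    """Return one fixed-size vector for a clip of up to 30 seconds."""
    # Imported lazily so mean_pool stays usable without the heavy dependencies
    import torch
    import whisper

    # Loading the model on every call is slow; cache it in real use
    model = whisper.load_model(model_name)
    audio = whisper.pad_or_trim(whisper.load_audio(audio_path))
    mel = whisper.log_mel_spectrogram(audio, n_mels=model.dims.n_mels).to(model.device)
    with torch.no_grad():
        frames = model.embed_audio(mel.unsqueeze(0))  # (1, n_frames, n_features)
    return mean_pool(frames[0].cpu().numpy())


# Pooling on its own, with made-up frame values
print(mean_pool([[1.0, 2.0], [3.0, 4.0]]))
```

Given labelled clips, vectors like these could be passed to a classifier such as scikit-learn's RandomForestClassifier. Collecting and labelling that data is outside the scope of this sketch.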

Disclaimer

This project is for educational purposes only and is not a substitute for professional medical advice. Always consult a qualified healthcare provider for sleep‑related health concerns.
