Don't Ignore Your Snoring: Building a Local Sleep Apnea Detector with OpenAI Whisper and Librosa

Published: January 18, 2026, 09:50 AM (GMT+9)

Source: Dev.to

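Introduction

Sleep is supposed to be when the body recharges, but for millions of people it is a fight for breath. Sleep apnea is a silent killer that often goes undiagnosed because clinical sleep studies are expensive and intrusive. What if you could monitor breathing patterns locally, using audio signal processing and OpenAI Whisper, without the recording ever leaving your machine?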

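Overview

This tutorial demonstrates a hybrid approach to sleep apnea detection:

Acoustic track: Librosa applies the fast Fourier transform (FFT) to separate silence from snoring frequencies.
Semantic track: Whisper's encoder extracts high-level audio features to distinguish gasping, choking, and background noise.

Both tracks feed an event detection engine, which produces an Apnea-Hypopnea Index (AHI) report that is stored locally.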

graph TD
    A[Raw Audio Input .wav/.mp3] --> B{FFmpeg Processing}
    B --> C[Segmented Audio Chunks]
    C --> D[Librosa: Spectral Analysis]
    C --> E[Whisper: Feature Extraction]
    D --> F[Frequency & Amplitude Thresholding]
    E --> G[Acoustic Pattern Recognition]
    F --> H[Event Detection Engine]
    G --> H[Event Detection Engine]
    H --> I[Apnea‑Hypopnea Index - AHI Report]
    I --> J[Local Storage / Privacy First]
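
The last stage of the diagram, the AHI report, comes down to simple arithmetic: the number of apnea and hypopnea events divided by the hours of sleep. The sketch below is one possible way to compute it, not the author's implementation. The function names are hypothetical, the recording length stands in for true sleep time (which audio alone cannot measure), and the severity bands are the commonly cited adult cut-offs of 5, 15, and 30 events per hour.

```python
def apnea_hypopnea_index(n_apneas, n_hypopneas, hours_recorded):
    """Events per hour. Assumption: recording time approximates sleep time."""
    if hours_recorded <= 0:
        raise ValueError("hours_recorded must be positive")
    return (n_apneas + n_hypopneas) / hours_recorded


def ahi_severity(ahi):
    """Map an AHI value to the commonly cited adult severity bands."""
    if ahi < 5:
        return "normal"
    if ahi < 15:
        return "mild"
    if ahi < 30:
        return "moderate"
    return "severe"


# Example: 30 apneas and 10 hypopneas flagged over an 8-hour recording
ahi = apnea_hypopnea_index(30, 10, 8)
print(f"AHI = {ahi:.1f} ({ahi_severity(ahi)})")
```

Events flagged from audio alone are not clinically scored events, so a number produced this way is best read as a rough screening signal.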

Prerequisites

Python 3.9+
OpenAI Whisper (audio feature extraction)
Librosa (audio analysis)
PyTorch (runs the Whisper model)
FFmpeg (audio decoding)

pip install openai-whisper librosa torch matplotlib ffmpeg-python

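Energy-Based Breathing Analysis

First, we compute the short-time Fourier transform (STFT) to inspect how energy is distributed over time. Snoring typically falls in the 60 Hz to 2000 Hz range, while an apnea event shows up as an abrupt drop in energy that lasts 10 seconds or more.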

import librosa
import numpy as np

HOP_LENGTH = 512  # librosa's default hop length for stft() and rms()

def analyze_breathing_energy(audio_path, threshold=0.01):
    # Load audio (resampled to 16 kHz mono for Whisper compatibility)
    y, sr = librosa.load(audio_path, sr=16000)

    # STFT magnitude, then frame-wise RMS energy derived from it
    stft = np.abs(librosa.stft(y, hop_length=HOP_LENGTH))
    energy = librosa.feature.rms(S=stft)[0]  # shape: (n_frames,)

    # True where a frame is quieter than the threshold
    # (adjust the threshold to the room's ambient noise)
    silent_frames = energy < threshold
    return y, sr, silent_frames

def detect_apnea_events(audio_path, min_silence_s=10.0):
    y, sr, silent_frames = analyze_breathing_energy(audio_path)

    events = []
    run_start = None
    # A trailing False closes a silent run that reaches the end of the file
    for i, silent in enumerate(np.append(silent_frames, False)):
        if silent and run_start is None:
            run_start = i
        elif not silent and run_start is not None:
            start, end = librosa.frames_to_time(
                [run_start, i], sr=sr, hop_length=HOP_LENGTH
            )
            if end - start >= min_silence_s:  # potential apnea duration
                events.append(f"Apnea warning at {start:.2f} seconds")
            run_start = None

    return events

print(detect_apnea_events("sleep_record.wav"))
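
The energy threshold above only detects quiet stretches; it does not use the 60 Hz to 2000 Hz snoring range mentioned earlier. One way to bring that range in is to measure how much of a frame's spectral energy lies inside it. The following is a minimal sketch using NumPy's FFT rather than Librosa, so it can be tried on synthetic tones without an audio file. The function name `band_energy_ratio` and its defaults are assumptions for illustration, not part of the original article.

```python
import numpy as np


def band_energy_ratio(frame, sr, f_lo=60.0, f_hi=2000.0):
    """Fraction of a frame's spectral energy inside [f_lo, f_hi] Hz."""
    frame = np.asarray(frame, dtype=np.float64)
    windowed = frame * np.hanning(len(frame))    # Hann window limits leakage
    power = np.abs(np.fft.rfft(windowed)) ** 2   # one-sided power spectrum
    freqs = np.fft.rfftfreq(len(frame), d=1.0 / sr)
    total = power.sum()
    if total == 0.0:
        return 0.0                               # digital silence
    in_band = (freqs >= f_lo) & (freqs <= f_hi)
    return float(power[in_band].sum() / total)


# Synthetic check: a 200 Hz tone lies inside the band, a 5 kHz tone outside
sr = 16000
t = np.arange(2048) / sr
low_tone = np.sin(2 * np.pi * 200 * t)
high_tone = np.sin(2 * np.pi * 5000 * t)
print(band_energy_ratio(low_tone, sr), band_energy_ratio(high_tone, sr))
```

A frame that is loud and has a high ratio is a plausible snore candidate, while a frame that is quiet regardless of ratio counts toward a silent run. Any cut-off on the ratio would need tuning on real recordings.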

Visualizing Breathing Patterns

Plotting amplitude over time with Matplotlib (and Librosa's display utilities) makes flat stretches and the compensatory spikes that follow them easy to spot.

import matplotlib.pyplot as plt
import librosa.display

def plot_breathing(y, sr):
    plt.figure(figsize=(12, 4))
    librosa.display.waveshow(y, sr=sr, alpha=0.5)
    plt.title("Nocturnal Breathing Pattern")
    plt.xlabel("Time (s)")
    plt.ylabel("Amplitude")
    plt.show()

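Next Steps

Train a snoring classifier (for example, a Random Forest) on top of Whisper embeddings.
Integrate the pipeline with a mobile app (Flutter, React Native) to deliver real-time bedside alerts.
Explore high-concurrency audio streaming and medical-LLM summarization, as covered on the WellAlly Tech Blog.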
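
As a starting point for the first item, a classifier needs one fixed-length vector per audio chunk. The sketch below shows one way to get it from Whisper's encoder, using the `openai-whisper` package's `load_audio`, `pad_or_trim`, and `log_mel_spectrogram` helpers. Mean-pooling the encoder frames is an assumption made here for simplicity, and the names `pool_encoder_frames` and `embed_chunk` are hypothetical.

```python
import numpy as np


def pool_encoder_frames(frames):
    """Mean-pool encoder output of shape (n_frames, dim) into a (dim,) vector."""
    frames = np.asarray(frames, dtype=np.float32)
    if frames.ndim != 2:
        raise ValueError("expected an array of shape (n_frames, dim)")
    return frames.mean(axis=0)


def embed_chunk(model, audio_path):
    """Return one pooled encoder embedding for the first 30 s of a file.

    `model` is a loaded Whisper model, e.g. whisper.load_model("base").
    """
    # Imported lazily so the pooling helper works without Whisper installed
    import torch
    import whisper

    audio = whisper.load_audio(audio_path)   # 16 kHz mono, decoded via FFmpeg
    audio = whisper.pad_or_trim(audio)       # the encoder expects 30 s windows
    mel = whisper.log_mel_spectrogram(audio, n_mels=model.dims.n_mels)
    mel = mel.to(model.device)
    with torch.no_grad():
        frames = model.encoder(mel.unsqueeze(0))[0]   # (n_frames, dim)
    return pool_encoder_frames(frames.cpu().numpy())
```

Calling `embed_chunk` once per segmented chunk and stacking the results gives a feature matrix that a classifier could be trained on, provided labeled snoring and non-snoring examples are available.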

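Disclaimer

This project is for educational purposes only and is not a substitute for professional medical advice. Always consult a qualified healthcare provider about sleep-related health concerns.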
