Stop Ignoring Your Snoring: Build a Local Sleep Apnea Detector with OpenAI Whisper and Librosa

Published: 2 days ago (January 18, 2026, 08:50 GMT+8)

4 min read

Source: Dev.to

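Introduction

Sleep is supposed to be the time when our bodies recharge, but for millions of people it is a fight for air. Sleep apnea is a silent killer that often goes undiagnosed because clinical sleep studies are expensive and invasive. What if you could monitor your breathing patterns locally, with full privacy, using audio signal processing and OpenAI Whisper?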

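Overview

This tutorial demonstrates a hybrid approach to sleep apnea detection:

Acoustic track – Librosa performs a fast Fourier transform (FFT) to distinguish "silence" from "snoring" frequencies.
Semantic track – Whisper's encoder extracts high-level audio features to distinguish gasping, choking, and background noise.

Both tracks feed an event detection engine, which produces a locally stored Apnea-Hypopnea Index (AHI) report.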
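
For reference, the AHI is conventionally the number of apnea and hypopnea events per hour of sleep. Here is a minimal sketch of that arithmetic. The function names are hypothetical, recording time stands in for true sleep time (which audio alone cannot measure), and the severity bands are the commonly cited adult cut-offs of 5, 15, and 30 events per hour.

```python
def compute_ahi(event_count, recording_seconds):
    """Events per hour; recording time is used as a stand-in for sleep time."""
    if recording_seconds <= 0:
        raise ValueError("recording_seconds must be positive")
    return event_count / (recording_seconds / 3600.0)


def ahi_severity(ahi):
    """Map an AHI value to the commonly cited adult severity bands."""
    if ahi < 5:
        return "normal"
    if ahi < 15:
        return "mild"
    if ahi < 30:
        return "moderate"
    return "severe"


# Example: 40 detected events over an 8-hour recording
ahi = compute_ahi(event_count=40, recording_seconds=8 * 3600)
print(f"AHI = {ahi:.1f} events/hour ({ahi_severity(ahi)})")
```

A number produced this way is only as good as the event detector behind it, so treat it as a rough screening signal rather than a clinical score.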

graph TD
    A[Raw Audio Input .wav/.mp3] --> B{FFmpeg Processing}
    B --> C[Segmented Audio Chunks]
    C --> D[Librosa: Spectral Analysis]
    C --> E[Whisper: Feature Extraction]
    D --> F[Frequency & Amplitude Thresholding]
    E --> G[Acoustic Pattern Recognition]
    F --> H[Event Detection Engine]
    G --> H[Event Detection Engine]
    H --> I[Apnea‑Hypopnea Index - AHI Report]
    I --> J[Local Storage / Privacy First]
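
The "Segmented Audio Chunks" and "Whisper: Feature Extraction" steps in the diagram could look roughly like the sketch below. It is an assumption about how the pieces fit together, not the article's implementation: `chunk_bounds` and `whisper_chunk_embeddings` are hypothetical names, the 30-second chunk size matches Whisper's fixed input window, and mean pooling over time is just one simple way to get a single vector per chunk.

```python
def chunk_bounds(n_samples, sr=16000, chunk_s=30.0):
    """Return (start, end) sample indices for consecutive chunks of chunk_s seconds."""
    size = int(chunk_s * sr)
    return [(s, min(s + size, n_samples)) for s in range(0, n_samples, size)]


def whisper_chunk_embeddings(audio_path, model_name="base"):
    """Mean-pooled Whisper encoder features, one vector per 30 s chunk."""
    # Imported lazily so chunk_bounds() works without the heavy dependencies.
    import torch
    import whisper

    model = whisper.load_model(model_name)
    audio = whisper.load_audio(audio_path)  # float32 mono at 16 kHz, decoded via FFmpeg

    embeddings = []
    for start, end in chunk_bounds(len(audio)):
        chunk = whisper.pad_or_trim(audio[start:end])  # zero-pad the last chunk to 30 s
        mel = whisper.log_mel_spectrogram(chunk, n_mels=model.dims.n_mels).to(model.device)
        with torch.no_grad():
            feats = model.embed_audio(mel.unsqueeze(0))  # (1, frames, n_audio_state)
        # Average over the time axis to get one fixed-size vector per chunk
        embeddings.append(feats.mean(dim=1).squeeze(0).cpu().numpy())
    return embeddings


# A 65-second recording at 16 kHz splits into two full chunks and one partial chunk
print(chunk_bounds(16000 * 65))
```

Only the encoder is used here; no transcription is run, and everything stays on the local machine.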

Prerequisites

Python 3.9+
OpenAI Whisper (audio feature extraction)
Librosa (audio analysis)
PyTorch (runs the Whisper model)
FFmpeg (audio decoding)

pip install openai-whisper librosa torch matplotlib ffmpeg-python
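
Keep in mind that `ffmpeg-python` is only a Python wrapper; the FFmpeg executable itself is installed separately (for example through your system package manager). A small sketch for checking the setup before going further follows. The function name `check_environment` is hypothetical.

```python
import importlib.util
import shutil


def check_environment():
    """Report which prerequisites are visible from this Python interpreter."""
    # Import names differ from pip names: openai-whisper -> whisper, ffmpeg-python -> ffmpeg
    modules = ["whisper", "librosa", "torch", "matplotlib", "ffmpeg"]
    status = {name: importlib.util.find_spec(name) is not None for name in modules}
    # The FFmpeg executable must be on PATH; pip does not install it
    status["ffmpeg (binary)"] = shutil.which("ffmpeg") is not None
    return status


for name, ok in check_environment().items():
    print(f"{name:16s} {'ok' if ok else 'MISSING'}")
```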

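Energy-Based Breathing Analysis

Start by computing the short-time Fourier transform (STFT) to inspect how energy is distributed. Snoring typically falls in the 60 Hz – 2000 Hz range, while apnea events show up as sudden drops in energy.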
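
As a side note on that frequency range, one way to quantify "how much of this frame sits in the snoring band" is a band-energy ratio. The sketch below uses plain NumPy on a single frame so it runs without any audio file; `band_energy_ratio` is a hypothetical helper, and the 60 Hz and 2000 Hz defaults simply reuse the range quoted above.

```python
import numpy as np


def band_energy_ratio(frame, sr, fmin=60.0, fmax=2000.0):
    """Fraction of a frame's spectral energy that lies between fmin and fmax."""
    window = np.hanning(len(frame))  # taper the frame to reduce spectral leakage
    power = np.abs(np.fft.rfft(frame * window)) ** 2
    freqs = np.fft.rfftfreq(len(frame), d=1.0 / sr)
    in_band = (freqs >= fmin) & (freqs <= fmax)
    total = power.sum()
    return float(power[in_band].sum() / total) if total > 0 else 0.0


sr = 16000
t = np.arange(2048) / sr
low_tone = np.sin(2 * np.pi * 200 * t)    # inside the assumed snoring band
high_tone = np.sin(2 * np.pi * 5000 * t)  # outside it
print(band_energy_ratio(low_tone, sr), band_energy_ratio(high_tone, sr))
```

A ratio near 1 means the frame is dominated by low-frequency content, and a ratio near 0 points to hiss or other high-frequency noise. Real snoring is broadband and messy, so any cut-off on this ratio would need tuning against actual recordings.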

import librosa
import numpy as np

HOP_LENGTH = 512  # librosa's default hop for stft() and rms()

def analyze_breathing_energy(audio_path, threshold=0.01):
    # Load audio (resampled to 16 kHz mono for Whisper compatibility)
    y, sr = librosa.load(audio_path, sr=16000)

    # STFT magnitude (for spectral inspection) and per-frame RMS energy
    stft = np.abs(librosa.stft(y, hop_length=HOP_LENGTH))
    energy = librosa.feature.rms(y=y, hop_length=HOP_LENGTH)

    # Mark frames whose RMS falls below the threshold as "silent"
    # (adjust the threshold to the room's ambient noise)
    silent_frames = energy < threshold
    return y, sr, silent_frames

def detect_apnea_events(audio_path, min_silence_s=10.0):
    y, sr, silent_frames = analyze_breathing_energy(audio_path)
    silent = silent_frames[0]

    # Number of consecutive frames needed to cover min_silence_s seconds
    window = int(np.ceil(min_silence_s * sr / HOP_LENGTH))

    events = []
    # Simplified sliding-window logic: step one window at a time and
    # flag windows in which every frame is silent (potential apnea)
    for i in range(0, len(silent) - window + 1, window):
        if np.all(silent[i : i + window]):
            start_time = librosa.frames_to_time(i, sr=sr, hop_length=HOP_LENGTH)
            events.append(f"Apnea warning at {start_time:.2f} seconds")

    return events

print(detect_apnea_events("sleep_record.wav"))
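
A fixed-window scan can miss a pause that straddles two windows. One alternative, sketched here under assumptions, is to measure runs of consecutive silent frames so each event gets a start time and a duration. `find_silent_runs` is a hypothetical helper; the 10-second minimum follows the comment in the code above, and the frame duration assumes a hop of 512 samples at 16 kHz.

```python
def find_silent_runs(silent, frame_s, min_silence_s=10.0):
    """Return (start_s, duration_s) for each run of silent frames lasting at least min_silence_s."""
    runs = []
    run_start = None
    # The trailing False acts as a sentinel that closes a run ending at the last frame
    for i, is_silent in enumerate(list(silent) + [False]):
        if is_silent and run_start is None:
            run_start = i
        elif not is_silent and run_start is not None:
            duration = (i - run_start) * frame_s
            if duration >= min_silence_s:
                runs.append((run_start * frame_s, duration))
            run_start = None
    return runs


frame_s = 512 / 16000  # hop_length / sr, i.e. 0.032 s per frame
# Synthetic flags: a 400-frame pause (12.8 s) and a 100-frame pause (3.2 s)
flags = [False] * 100 + [True] * 400 + [False] * 50 + [True] * 100
print(find_silent_runs(flags, frame_s))
```

Only the longer pause is reported, because the shorter one falls below the 10-second minimum. The function accepts any sequence of booleans, including one row of a NumPy boolean array.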

Visualizing Breathing Patterns

Matplotlib, together with Librosa's display utilities, can plot amplitude over time to reveal flat lines and compensatory peaks.

import matplotlib.pyplot as plt
import librosa.display

def plot_breathing(y, sr):
    plt.figure(figsize=(12, 4))
    librosa.display.waveshow(y, sr=sr, alpha=0.5)
    plt.title("Nocturnal Breathing Pattern")
    plt.xlabel("Time (s)")
    plt.ylabel("Amplitude")
    plt.show()

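Next Steps

Train a snoring classifier (for example, a random forest) on Whisper embeddings.
Integrate the pipeline with a mobile app (Flutter, React Native) for real-time bedside alerts.
Explore high-concurrency audio streaming and medical LLM summarization techniques; see the WellAlly Tech Blog for details.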
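
To make the first item concrete, here is a rough sketch of fitting a random forest with scikit-learn. Everything about the data is an assumption: the vectors are synthetic stand-ins for mean-pooled Whisper embeddings, the 512-dimensional size is illustrative, and the labels are invented. Real training would need hand-labelled snore and non-snore chunks.

```python
import numpy as np
from sklearn.ensemble import RandomForestClassifier
from sklearn.model_selection import train_test_split

rng = np.random.default_rng(0)

# Synthetic stand-ins for per-chunk embeddings: two well-separated Gaussian clusters
n_per_class, dim = 100, 512
snore = rng.normal(loc=1.0, scale=1.0, size=(n_per_class, dim))
other = rng.normal(loc=-1.0, scale=1.0, size=(n_per_class, dim))
X = np.vstack([snore, other])
y = np.array([1] * n_per_class + [0] * n_per_class)  # 1 = snore, 0 = other

# Hold out a quarter of the data to check the fit
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.25, random_state=0, stratify=y
)
clf = RandomForestClassifier(n_estimators=100, random_state=0)
clf.fit(X_train, y_train)
accuracy = clf.score(X_test, y_test)
print(f"held-out accuracy on synthetic data: {accuracy:.2f}")
```

The score on synthetic clusters says nothing about performance on real recordings; it only shows that the training and evaluation plumbing works.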

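Disclaimer

This project is for educational purposes only and is not a substitute for professional medical advice. Always consult a qualified healthcare provider about any sleep-related health concerns.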
