Detecting Burnout Before It Hits: Building an HRV Anomaly Detector with Isolation Forest 🚀
Source: Dev.to
Have you ever woken up feeling like a truck hit you, even though you “rested”? Or maybe you smashed a PR in the gym only to be sidelined by a cold 24 hours later? Our bodies often send distress signals long before we feel the symptoms. One of the most powerful signals is Heart Rate Variability (HRV).
In this tutorial we’ll build a health‑monitoring pipeline that runs HRV anomaly detection with Isolation Forest in Python. Using unsupervised learning, we can flag “outlier” days that may signal early overtraining or an oncoming infection. If you want to master wearable data analysis and scikit‑learn, you’re in the right place. 🥑
The Science: Why HRV?
HRV measures the variation in the time between consecutive heartbeats. A high HRV usually indicates a well‑recovered nervous system, while a sudden drop (or an unusually high spike) often precedes physical “crashes.” A single fixed threshold isn’t enough because everyone’s “normal” is different. That’s where Isolation Forest shines: it doesn’t need labeled data to know when your body is acting “weird.”
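To make “variation in time between heartbeats” concrete, here is a minimal sketch of RMSSD (root mean square of successive differences), one common time‑domain HRV metric. The article doesn’t say which metric your wearable reports, so treating HRV as RMSSD is an assumption, and the RR intervals below are invented for illustration.

```python
import math

def rmssd(rr_intervals_ms):
    """Root mean square of successive differences between RR intervals (ms)."""
    if len(rr_intervals_ms) < 2:
        raise ValueError("need at least two RR intervals")
    # Difference between each beat-to-beat interval and the next one
    diffs = [b - a for a, b in zip(rr_intervals_ms, rr_intervals_ms[1:])]
    return math.sqrt(sum(d * d for d in diffs) / len(diffs))

# Hypothetical RR intervals: milliseconds between consecutive beats
steady_rr = [800, 802, 799, 801, 800, 798]   # almost metronomic, low HRV
varied_rr = [800, 850, 780, 860, 790, 840]   # more beat-to-beat variation

print(f"steady: {rmssd(steady_rr):.1f} ms")
print(f"varied: {rmssd(varied_rr):.1f} ms")
```

Both series have a similar average heart rate, yet their RMSSD differs a lot, which is why HRV carries information that plain heart rate doesn’t.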
The Architecture 🏗️
graph TD
A[Wearable Device / Apple Health] -->|Export| B(InfluxDB)
B --> C{Python Analytics Engine}
C --> D[Scikit‑learn: Isolation Forest]
D -->|Identify Outliers| E[Grafana Dashboard]
E -->|Alert| F[User: Take a Rest Day!]
style D fill:#f9f,stroke:#333,stroke-width:2px
Prerequisites
- Python 3.9+
- scikit‑learn (for the ML magic)
- InfluxDB (optimized for time‑series wearable data)
- Grafana (for the dashboards)
Step 1: Pulling Data from InfluxDB 📥
import pandas as pd
from influxdb_client import InfluxDBClient
# Setup connection
token = "YOUR_INFLUX_TOKEN"
org = "YourOrg"
bucket = "HealthData"
client = InfluxDBClient(url="http://localhost:8086", token=token, org=org)
def fetch_hrv_data():
    query = f'''
    from(bucket: "{bucket}")
      |> range(start: -30d)
      |> filter(fn: (r) => r["_measurement"] == "heart_rate_variability")
      |> pivot(rowKey:["_time"], columnKey: ["_field"], valueColumn: "_value")
    '''
    df = client.query_api().query_data_frame(query)
    return df
# Assume df has columns: ['_time', 'hrv_ms', 'sleep_duration_hr']
df = fetch_hrv_data()
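One thing to watch: `query_data_frame` can return either a single DataFrame or a list of DataFrames, and the result carries InfluxDB metadata columns alongside your fields. The helper below is a sketch of how you might normalize that. The name `tidy_hrv_frame` is made up for this example, and it assumes the `_time`, `hrv_ms`, and `sleep_duration_hr` columns from the comment above.

```python
import pandas as pd

# Column names assumed by this tutorial; adjust them to your own fields.
FEATURE_COLUMNS = ["hrv_ms", "sleep_duration_hr"]

def tidy_hrv_frame(result):
    """Normalize the output of query_data_frame into one clean DataFrame."""
    # A multi-table result comes back as a list of DataFrames; merge them.
    # (An empty list, meaning no data at all, would raise here.)
    if isinstance(result, list):
        result = pd.concat(result, ignore_index=True)

    # Keep only the timestamp and the features, dropping InfluxDB metadata.
    tidy = result[["_time"] + FEATURE_COLUMNS].copy()
    tidy["_time"] = pd.to_datetime(tidy["_time"], utc=True)
    # Drop days where either feature is missing.
    tidy = tidy.dropna(subset=FEATURE_COLUMNS)
    return tidy.sort_values("_time").reset_index(drop=True)
```

You would call it as `df = tidy_hrv_frame(fetch_hrv_data())` before handing the frame to the model.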
Step 2: Detecting Anomalies with Isolation Forest 🌲
from sklearn.ensemble import IsolationForest
import numpy as np
def detect_overtraining(df):
    # Focus on HRV and sleep duration as the primary features
    features = df[['hrv_ms', 'sleep_duration_hr']]
    # contamination=0.05 → expect ~5% of days to be anomalous
    model = IsolationForest(
        n_estimators=100,
        contamination=0.05,
        random_state=42
    )
    # fit_predict returns labels, not scores: 1 = normal, -1 = anomaly
    df['anomaly_score'] = model.fit_predict(features)
    # Return potential red flags
    return df[df['anomaly_score'] == -1]
anomalies = detect_overtraining(df)
print(f"Flagged {len(anomalies)} unusual days that deserve a closer look!")
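The labels from `fit_predict` only tell you *whether* a day was flagged. If you also want to know *how* unusual it was, `decision_function` gives a continuous score. Here is a small sketch on synthetic data (every number below is invented, including the planted “bad day”) showing how you could rank days by severity.

```python
import numpy as np
import pandas as pd
from sklearn.ensemble import IsolationForest

# Synthetic stand-in for 60 days of data (values invented for illustration).
rng = np.random.default_rng(0)
demo = pd.DataFrame({
    "hrv_ms": rng.normal(65, 5, 60),
    "sleep_duration_hr": rng.normal(7.5, 0.5, 60),
})
# Plant one obviously bad day: very low HRV after very short sleep.
demo.loc[59, ["hrv_ms", "sleep_duration_hr"]] = [30.0, 3.5]

features = demo[["hrv_ms", "sleep_duration_hr"]]
model = IsolationForest(n_estimators=100, contamination=0.05, random_state=42)
model.fit(features)

# decision_function: negative values are flagged as anomalies, and the more
# negative the value, the easier the day was to isolate from the rest.
demo["score"] = model.decision_function(features)
demo["is_anomaly"] = model.predict(features) == -1

# Show the most unusual days first.
print(demo.sort_values("score").head())
```

Keep in mind that `contamination=0.05` fixes the share of flagged days at roughly 5%, so a few days get flagged even in a perfectly healthy month. The score helps you separate borderline days from the truly odd ones.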
The “Official” Way to Scale 💡
While this script is a great start for a personal project, a production‑grade health monitoring system must handle missing data, sensor noise, and baseline drift. For advanced architectural patterns on biometric data processing and more production‑ready examples, see the WellAlly Official Blog.
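As a starting point for two of those problems, here is a rough sketch, not a production recipe: it interpolates short gaps and measures each day against a rolling median of the preceding days, so a slowly shifting baseline doesn’t look like an anomaly. The function name, the 14‑day window, and the 2‑day gap limit are all assumptions you’d want to tune.

```python
import numpy as np
import pandas as pd

def add_baseline_deviation(hrv, baseline_window=14, max_gap=2):
    """Fill short gaps and compute each day's deviation from a rolling baseline."""
    # Missing data: interpolate at most `max_gap` consecutive missing values,
    # and only between valid readings (no extrapolation at the edges).
    filled = hrv.interpolate(limit=max_gap, limit_area="inside")
    # Baseline drift: a rolling median of the preceding days, which is also
    # robust to the occasional noisy reading. shift(1) keeps a day out of
    # its own baseline.
    baseline = filled.rolling(baseline_window, min_periods=3).median().shift(1)
    return pd.DataFrame({
        "hrv_ms": filled,
        "hrv_baseline": baseline,
        "hrv_deviation": filled - baseline,
    })

# Hypothetical week: one missing reading, then a sharp drop on the last day.
week = pd.Series([60.0, 60.0, np.nan, 60.0, 60.0, 60.0, 40.0])
print(add_baseline_deviation(week))
```

You could then feed `hrv_deviation` to the Isolation Forest instead of raw `hrv_ms`, after dropping the first few rows where the baseline is still undefined.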
Step 3: Visualizing in Grafana 📊
After the Python script flags an anomaly, write a “flag” back to InfluxDB. In Grafana you can then plot HRV in a Time series panel and add a State timeline panel to highlight the red zones.
Pro‑tip: Use yellow for mild deviations and red for “stop training immediately” signals.
// Example Flux query for Grafana
from(bucket: "HealthData")
|> range(start: v.timeRangeStart, stop: v.timeRangeStop)
|> filter(fn: (r) => r["_measurement"] == "hrv_anomalies")
|> yield(name: "anomalies")
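The query above reads from an `hrv_anomalies` measurement, so something has to write it. Here is one way that could look with `influxdb_client`, which accepts plain dict records. The `severity` tag and `flag` field are names invented for this sketch, and `write_flags` needs a running InfluxDB instance to actually do anything.

```python
from datetime import datetime, timezone

def anomalies_to_records(rows, measurement="hrv_anomalies"):
    """Convert flagged days into dict records that influxdb_client can write."""
    records = []
    for row in rows:
        records.append({
            "measurement": measurement,
            # Hypothetical tag: "yellow" or "red", matching the pro-tip above.
            "tags": {"severity": row["severity"]},
            "fields": {"hrv_ms": float(row["hrv_ms"]), "flag": 1},
            "time": row["time"],
        })
    return records

def write_flags(records, url, token, org, bucket):
    """Write the records to InfluxDB (requires a reachable server)."""
    # Imported here so the conversion above works without the client installed.
    from influxdb_client import InfluxDBClient
    from influxdb_client.client.write_api import SYNCHRONOUS

    with InfluxDBClient(url=url, token=token, org=org) as client:
        client.write_api(write_options=SYNCHRONOUS).write(bucket=bucket, record=records)

# Hypothetical flagged day, e.g. built from one row of the anomalies frame.
flagged = [{
    "time": datetime(2024, 5, 1, tzinfo=timezone.utc),
    "hrv_ms": 38.0,
    "severity": "red",
}]
records = anomalies_to_records(flagged)
print(records)
```

Storing severity as a tag lets a State timeline panel map each value to a color without any extra query logic.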
Conclusion: Data > Intuition 🧘‍♂️
By the time you feel “burned out,” your HRV has likely been trending downward for days. Using scikit‑learn’s Isolation Forest moves you from reactive recovery to proactive health management.
Summary of what we built
- Connected to InfluxDB for time‑series retrieval.
- Implemented an unsupervised ML model to find health outliers.
- Visualized the results to help catch a possible infection or overtraining about a day before symptoms show.
What are you tracking? Oura, Whoop, Apple Watch…? Drop a comment and let’s discuss the best features for anomaly detection! 👇