[Paper] An interpretable data-driven approach to optimizing clinical fall risk assessment
Source: arXiv - 2601.05194v1
Overview
The authors present a data‑driven, yet fully interpretable, method for improving the Johns Hopkins Fall Risk Assessment Tool (JHFRAT). By re‑weighting the existing additive score with a constrained optimization technique, they boost predictive accuracy without changing the tool’s workflow, making the approach attractive for health‑system engineers who need both performance and auditability.
Key Contributions
- Constrained Score Optimization (CSO): A lightweight algorithm that adjusts item weights of an existing clinical score while preserving its additive form and clinical cut‑offs.
- Large‑scale retrospective validation: Tested on 54 k inpatient admissions across three hospitals, with a clinician‑labeled subset of high‑ and low‑risk encounters.
- Performance gain: CSO raises the AUC‑ROC from 0.86 (original JHFRAT) to 0.91, equivalent to correctly flagging ~35 extra high‑risk patients per week.
- Interpretability vs. black‑box trade‑off: Shows that a modest, transparent model can approach the performance of a black‑box XGBoost model (AUC‑ROC = 0.94) while remaining robust to label noise.
- Deployment‑ready: No changes to EHR integration or user interface are required; only the numeric weights are updated.
Methodology
- Data collection: Extracted structured EHR fields (demographics, vitals, medications, prior falls, etc.) for 54 209 admissions (Mar 2022–Oct 2023).
- Label definition: “High fall risk” vs. “low fall risk” derived from clinician‑reviewed outcomes, yielding 20 208 high‑risk and 13 941 low‑risk encounters.
- Baseline model: The original JHFRAT, an additive score with pre‑defined item weights and thresholds.
- Constrained Score Optimization:
- Formulated as a convex optimization problem that minimizes a loss (e.g., logistic loss) subject to linear constraints preserving the original score’s structure (non‑negative weights, monotonicity, and fixed threshold values).
- Solved using standard solvers (e.g., CVXPY) to obtain new weights that better align the score with the study’s risk labels.
- Comparative models: Trained a constrained logistic regression (knowledge‑based) and a gradient‑boosted tree (XGBoost) as black‑box baselines.
- Evaluation: Measured AUC‑ROC, calibration, and robustness to label perturbations on a held‑out test set.
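The constrained re‑weighting step above can be sketched as a small optimization. The sketch below uses projected gradient descent on a logistic loss with a non‑negativity constraint as a simple stand‑in for the paper's CSO formulation (the authors use a convex solver such as CVXPY); the item matrix, "true" weights, and sample sizes are all synthetic illustrations, not the actual JHFRAT items or data:

```python
import numpy as np

rng = np.random.default_rng(0)

# Synthetic stand-in for a JHFRAT-style item matrix: rows = admissions,
# columns = score items with ordinal responses 0-3. Illustrative only.
X = rng.integers(0, 4, size=(500, 7)).astype(float)
true_w = np.array([2.0, 1.0, 0.5, 3.0, 0.0, 1.5, 0.25])  # hypothetical ground truth
logits = X @ true_w - 8.0
y = (rng.random(500) < 1 / (1 + np.exp(-logits))).astype(float)

def logistic_loss(w, b):
    z = X @ w + b
    # mean of log(1 + exp(z)) - y*z, computed stably via logaddexp
    return np.mean(np.logaddexp(0.0, z) - y * z)

# Projected gradient descent: minimize logistic loss subject to w >= 0,
# preserving the additive, non-negative structure of the original score.
w = np.ones(7)
b = 0.0
lr = 0.05
loss_before = logistic_loss(w, b)
for _ in range(2000):
    p = 1 / (1 + np.exp(-(X @ w + b)))
    grad_w = X.T @ (p - y) / len(y)
    grad_b = np.mean(p - y)
    w = np.maximum(w - lr * grad_w, 0.0)  # projection enforces non-negativity
    b -= lr * grad_b
loss_after = logistic_loss(w, b)
```

The projection step is what keeps the re‑weighted score clinically legible: every item still contributes a non‑negative, additive amount, so the tool's interpretation and cut‑off logic are unchanged.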
Results & Findings
| Model | AUC‑ROC | Calibration | Extra high‑risk patients captured / week |
|---|---|---|---|
| Original JHFRAT | 0.86 | – | – |
| CSO (re‑weighted) | 0.91 | Improved | ~35 |
| Constrained Logistic Regression | 0.89 | Slightly better than JHFRAT | – |
| XGBoost (black‑box) | 0.94 | Best calibration but less interpretable | – |
- The CSO model consistently outperformed the legacy tool across all hospitals.
- Adding extra EHR variables to CSO did not materially change performance, indicating the original item set already captures most predictive signal.
- The black‑box XGBoost achieved the highest AUC but showed greater sensitivity to how “high risk” was labeled, raising concerns for deployment stability.
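The label‑sensitivity finding above can be approximated with a simple experiment: score a held‑out set, flip a fraction of labels to mimic chart‑review disagreement, and re‑measure AUC. Everything below is a synthetic sketch (data, flip rate, and the rank‑based AUC helper are assumptions, not the paper's protocol):

```python
import numpy as np

rng = np.random.default_rng(1)

def auc(scores, labels):
    # Rank-based AUC (Mann-Whitney): probability that a random positive
    # outranks a random negative.
    order = np.argsort(scores)
    ranks = np.empty(len(scores))
    ranks[order] = np.arange(1, len(scores) + 1)
    pos = labels == 1
    n_pos, n_neg = pos.sum(), (~pos).sum()
    return (ranks[pos].sum() - n_pos * (n_pos + 1) / 2) / (n_pos * n_neg)

# Synthetic risk scores; labels are drawn so the scores are informative.
scores = rng.random(1000)
labels = (rng.random(1000) < scores).astype(int)
base = auc(scores, labels)

# Flip 10% of labels to mimic chart-review noise, then re-measure AUC.
noisy = labels.copy()
flip = rng.choice(1000, size=100, replace=False)
noisy[flip] = 1 - noisy[flip]
perturbed = auc(scores, noisy)
```

A deployment‑stable model shows only a modest drop under such perturbations; a large gap between `base` and `perturbed` across candidate models is the kind of warning sign the authors raise for the black‑box baseline.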
Practical Implications
- Rapid integration: Health‑IT teams can replace the static weight table in the existing JHFRAT module with the CSO‑derived weights—no UI redesign or new data pipelines needed.
- Regulatory friendliness: Maintaining the additive, rule‑based structure satisfies audit requirements and facilitates explainability to clinicians and compliance officers.
- Resource allocation: More accurate risk stratification enables better staffing of fall‑prevention aides, potentially reducing adverse events and associated costs.
- Scalable framework: The CSO approach can be applied to other legacy clinical scores (e.g., sepsis alerts, readmission risk) where interpretability is non‑negotiable.
- Open‑source potential: The optimization formulation is simple enough to be packaged as a Python library, encouraging community contributions and cross‑institution benchmarking.
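The "replace the static weight table" integration path can be pictured in a few lines: the additive scoring function is untouched, and only the weight mapping changes. Item names and weight values below are hypothetical illustrations, not the actual JHFRAT items or the CSO‑derived weights:

```python
# Hypothetical item weights, before and after re-weighting. The scoring
# function itself never changes; only this table is swapped.
ORIGINAL_WEIGHTS = {"fall_history": 5, "secondary_dx": 2, "mobility_aid": 2}
CSO_WEIGHTS = {"fall_history": 6, "secondary_dx": 1, "mobility_aid": 3}

def additive_score(item_responses: dict, weights: dict) -> int:
    # JHFRAT-style additive score: sum of (item response x item weight).
    return sum(weights[item] * response for item, response in item_responses.items())

patient = {"fall_history": 1, "secondary_dx": 2, "mobility_aid": 0}
old_score = additive_score(patient, ORIGINAL_WEIGHTS)  # uses legacy weights
new_score = additive_score(patient, CSO_WEIGHTS)       # uses re-weighted table
```

Because the function signature and score form are identical, downstream threshold logic, documentation, and clinician training all carry over unchanged.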
Limitations & Future Work
- Retrospective design: The study relies on historical labels; prospective validation is needed to confirm real‑world impact on fall incidence.
- Label noise: “High risk” was defined by chart review, which may introduce subjectivity; future work could explore multi‑label or probabilistic outcomes.
- Generalizability: Data come from a single health system; external validation on different hospital settings and patient populations is required.
- Dynamic risk factors: The current model uses static admission data; incorporating time‑varying vitals or sensor data could further improve predictions.
- Automation of constraint selection: Future research could explore learning the constraint set itself, balancing interpretability with flexibility.
Authors
- Fardin Ganjkhanloo
- Emmett Springer
- Erik H. Hoyer
- Daniel L. Young
- Holley Farley
- Kimia Ghobadi
Paper Information
- arXiv ID: 2601.05194v1
- Categories: cs.LG
- Published: January 8, 2026