[Paper] Long-Sequence LSTM Modeling for NBA Game Outcome Prediction Using a Novel Multi-Season Dataset
Source: arXiv - 2512.08591v1
Overview
A new study tackles the notoriously tricky problem of forecasting NBA game results by training a deep‑learning model on massive, eight‑season (9,840‑game) input sequences. Stitching together data from the 2004‑05 through 2024‑25 seasons, the authors show that a Long Short‑Term Memory (LSTM) network can capture long‑range team dynamics and beat a suite of classic machine‑learning baselines.
Key Contributions
- Multi‑season dataset: Curated a longitudinal NBA dataset covering the 2004‑05 through 2024‑25 seasons, with game‑level stats, team rosters, and contextual features.
- Long‑sequence LSTM architecture: Designed an LSTM that ingests sequences covering eight full seasons, enabling the model to learn season‑over‑season trends and concept drift.
- Comprehensive benchmark: Evaluated against Logistic Regression, Random Forest, MLP, and a CNN‑based approach on the same data split.
- State‑of‑the‑art performance: Achieved 72.35 % accuracy, 73.15 % precision, and a 0.761 AUC‑ROC—substantially higher than all baselines.
- Open‑source release: Provided code and processed data (subject to NBA licensing) to encourage reproducibility and further research.
Methodology
1. Data collection & preprocessing
- Scraped box‑score statistics, player line‑ups, home/away indicators, and season identifiers for every regular‑season game from 2004‑05 to 2024‑25.
- Engineered features such as rolling win‑rates, average point differentials, and roster stability metrics.
- Normalized numeric fields and encoded categorical variables (team IDs, venue) using embeddings.
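A minimal pandas sketch of the rolling-feature step described above; the column names, 10-game window, and toy rows are illustrative assumptions rather than the paper's actual schema.

```python
import pandas as pd

# Hypothetical game log: one row per team per game, in chronological order.
games = pd.DataFrame({
    "team_id":   ["BOS", "BOS", "BOS", "LAL", "LAL", "LAL"],
    "game_date": pd.to_datetime(
        ["2024-10-22", "2024-10-24", "2024-10-26",
         "2024-10-22", "2024-10-25", "2024-10-27"]),
    "won":        [1, 0, 1, 1, 1, 0],
    "point_diff": [12, -5, 3, 8, 6, -10],
})

games = games.sort_values(["team_id", "game_date"])
grouped = games.groupby("team_id")

# Rolling features are shifted by one game so a row only sees information
# available *before* that game tips off (no target leakage).
games["rolling_win_rate"] = grouped["won"].transform(
    lambda s: s.shift(1).rolling(10, min_periods=1).mean())
games["rolling_point_diff"] = grouped["point_diff"].transform(
    lambda s: s.shift(1).rolling(10, min_periods=1).mean())

print(games)
```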
2. Sequence construction
- For each target game, the model receives the time‑ordered history of games preceding it (up to 9,840 games) as an input tensor.
- Padding and masking handle the early‑season games where the full history is unavailable.
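Below is a minimal sketch of how early-season histories can be left-padded and masked to a fixed window; the helper name, feature count, and 5-step window are illustrative stand-ins (the paper's window spans thousands of games).

```python
import torch

def pad_and_mask(history: torch.Tensor, max_len: int):
    """Left-pad a (T, F) feature history to (max_len, F) and return a mask.

    Early-season games have short histories (T < max_len); the boolean mask
    marks which timesteps are real so the model / loss can ignore padding.
    """
    num_games, num_feats = history.shape
    padded = torch.zeros(max_len, num_feats)
    mask = torch.zeros(max_len, dtype=torch.bool)
    keep = min(num_games, max_len)
    padded[-keep:] = history[-keep:]   # keep the most recent `keep` games
    mask[-keep:] = True
    return padded, mask

# Example: a team with only 3 prior games, padded to a 5-step window.
hist = torch.randn(3, 8)               # 3 games, 8 features each
padded, mask = pad_and_mask(hist, max_len=5)
print(padded.shape, mask)              # torch.Size([5, 8]), last 3 steps True
```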
3. Model architecture
- Embedding layer for team identifiers → 32‑dim vectors.
- Two stacked LSTM layers (256 and 128 hidden units) process the long sequence, preserving temporal dependencies.
- Fully‑connected head with a sigmoid output for binary win/loss prediction.
- Regularization via dropout (0.3) and L2 weight decay to curb over‑fitting on such a deep temporal model.
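A PyTorch sketch consistent with the layer sizes listed above (32‑dim team embeddings, stacked 256‑ and 128‑unit LSTMs, dropout 0.3, binary head); the input layout, class name, and feature dimensions are assumptions, not the authors' released code.

```python
import torch
import torch.nn as nn

class LongSequenceLSTM(nn.Module):
    """Sketch of the described architecture; exact wiring is an assumption."""

    def __init__(self, n_teams: int, n_numeric: int, emb_dim: int = 32):
        super().__init__()
        self.team_emb = nn.Embedding(n_teams, emb_dim)
        in_dim = n_numeric + 2 * emb_dim          # numeric stats + home/away team embeddings
        self.lstm1 = nn.LSTM(in_dim, 256, batch_first=True)
        self.lstm2 = nn.LSTM(256, 128, batch_first=True)
        self.dropout = nn.Dropout(0.3)
        self.head = nn.Linear(128, 1)             # sigmoid applied via the loss / at inference

    def forward(self, numeric, home_ids, away_ids):
        # numeric: (B, T, n_numeric); home_ids/away_ids: (B, T) integer team indices
        x = torch.cat(
            [numeric, self.team_emb(home_ids), self.team_emb(away_ids)], dim=-1)
        x, _ = self.lstm1(x)
        x, _ = self.lstm2(self.dropout(x))
        logits = self.head(self.dropout(x[:, -1]))  # final timestep summarizes the sequence
        return logits.squeeze(-1)

model = LongSequenceLSTM(n_teams=30, n_numeric=8)
numeric = torch.randn(4, 16, 8)                     # batch of 4 toy 16-step sequences
home = torch.randint(0, 30, (4, 16))
away = torch.randint(0, 30, (4, 16))
print(model(numeric, home, away).shape)             # torch.Size([4])
```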
4. Training & evaluation
- Used a chronological train/validation/test split (first 15 seasons for training, next 2 for validation, final 3 for testing) to respect temporal causality.
- Optimized with Adam (lr = 1e‑4) and binary cross‑entropy loss.
- Benchmarked against traditional ML models (trained on the same engineered features) and a CNN baseline that convolves over the game sequence.
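A sketch of the training setup under those stated choices (chronological season split, Adam at lr = 1e‑4, binary cross‑entropy, L2 via weight decay); the season boundaries, weight‑decay value, and stand‑in model are illustrative assumptions.

```python
import torch
import torch.nn as nn

# Chronological split by season start year, as described: no shuffling across time.
# The exact boundary years below are illustrative, not taken from the paper.
train_seasons = set(range(2004, 2019))   # first 15 seasons
val_seasons   = set(range(2019, 2021))   # next 2 seasons
test_seasons  = set(range(2021, 2024))   # final 3 seasons

model = nn.LSTM(8, 128, batch_first=True)   # stand-in for the full model sketched above
head = nn.Linear(128, 1)
params = list(model.parameters()) + list(head.parameters())

# Adam with lr = 1e-4; L2 regularization expressed as weight decay (value assumed).
optimizer = torch.optim.Adam(params, lr=1e-4, weight_decay=1e-5)
criterion = nn.BCEWithLogitsLoss()          # binary cross-entropy on win/loss labels

def train_step(batch_seq: torch.Tensor, labels: torch.Tensor) -> float:
    optimizer.zero_grad()
    hidden, _ = model(batch_seq)                # (B, T, 128)
    logits = head(hidden[:, -1]).squeeze(-1)    # predict from the final timestep
    loss = criterion(logits, labels.float())
    loss.backward()
    optimizer.step()
    return loss.item()

# Toy batch: 4 sequences of 16 games with 8 features each.
print(train_step(torch.randn(4, 16, 8), torch.randint(0, 2, (4,))))
```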
Results & Findings
| Model | Accuracy | Precision | AUC‑ROC |
|---|---|---|---|
| Logistic Regression | 61.2 % | 60.8 % | 0.64 |
| Random Forest | 64.5 % | 65.0 % | 0.68 |
| MLP (2‑layer) | 66.8 % | 67.2 % | 0.71 |
| CNN (1‑D) | 68.9 % | 69.4 % | 0.73 |
| Long‑Sequence LSTM | 72.35 % | 73.15 % | 0.761 |
- Long‑range context matters: Accuracy climbs steadily as the input window expands from a single season to eight seasons, confirming that team performance evolves slowly and benefits from historical context.
- Concept drift handling: The LSTM’s hidden state naturally adapts to roster changes, coaching switches, and rule adjustments across years, reducing the performance drop that plagues static models.
- Robustness: Variance across test seasons is lower for the LSTM, indicating more stable predictions even when a season deviates from historical norms (e.g., lock‑out years, pandemic‑shortened schedules).
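One way to quantify that per‑season stability is to score each held‑out season separately; the scikit‑learn sketch below is an illustrative assumption about the evaluation, not the paper's exact protocol.

```python
import numpy as np
from sklearn.metrics import accuracy_score, precision_score, roc_auc_score

def per_season_metrics(y_true, y_prob, seasons, threshold=0.5):
    """Compute accuracy / precision / AUC-ROC for each test season separately.

    Reporting per-season scores (and their spread) is one way to check how
    stable a model stays across atypical seasons.
    """
    results = {}
    for season in np.unique(seasons):
        idx = seasons == season
        y_hat = (y_prob[idx] >= threshold).astype(int)
        results[season] = {
            "accuracy":  accuracy_score(y_true[idx], y_hat),
            "precision": precision_score(y_true[idx], y_hat),
            "auc_roc":   roc_auc_score(y_true[idx], y_prob[idx]),
        }
    return results

# Toy example with random predictions over two "seasons".
rng = np.random.default_rng(0)
y_true = rng.integers(0, 2, 200)
y_prob = rng.random(200)
seasons = np.repeat([2022, 2023], 100)
print(per_season_metrics(y_true, y_prob, seasons))
```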
Practical Implications
- Coaching & analytics: Front offices can feed the model live game data to obtain probabilistic win forecasts, helping with in‑game decision making (e.g., lineup rotations, timeout timing).
- Betting & fantasy platforms: Higher‑quality odds and player‑prop predictions can be generated automatically, improving market efficiency and user engagement.
- Content personalization: Sports media can tailor pre‑game narratives (“Team X has a 78 % chance of winning based on eight‑year trends”) without manual statistical analysis.
- Transferable pipeline: The same long‑sequence LSTM framework can be adapted to other sports with seasonal structures (NFL, MLB, European soccer) or even non‑sports domains where concept drift spans years (stock market sector analysis, demand forecasting).
Limitations & Future Work
- Data licensing: The dataset relies on NBA‑provided statistics; broader distribution may be restricted, limiting open‑source reproducibility.
- Computational cost: Training on 9,840‑step sequences demands significant GPU memory; real‑time inference may require sequence truncation or model distillation.
- Feature scope: The study focuses on box‑score stats; incorporating advanced metrics (player tracking, injury reports, betting lines) could boost accuracy further.
- Explainability: LSTMs are opaque; future work should explore attention mechanisms or SHAP‑style analyses to surface the most influential temporal factors.
- Cross‑league generalization: Testing the model on other basketball leagues (EuroLeague, CBA) would validate its adaptability and uncover league‑specific dynamics.
Bottom line: By embracing a truly long‑term view of NBA history, this research demonstrates that deep sequential models can outshine traditional predictors and open new doors for data‑driven decision making across the basketball ecosystem.
Authors
- Charles Rios
- Longzhen Han
- Almas Baimagambetov
- Nikolaos Polatidis
Paper Information
- arXiv ID: 2512.08591v1
- Categories: cs.LG, cs.NE
- Published: December 9, 2025
- PDF: https://arxiv.org/pdf/2512.08591v1