[Paper] Improving Credit Card Fraud Detection with an Optimized Explainable Boosting Machine
Source: arXiv - 2602.06955v1
Overview
The paper presents an upgraded workflow for credit‑card fraud detection that couples the Explainable Boosting Machine (EBM), a highly interpretable generalized additive model trained with boosting, with systematic hyper‑parameter and preprocessing optimization. By avoiding conventional resampling for class imbalance, the authors achieve a ROC‑AUC of 0.983, beating both earlier EBM configurations and a suite of popular black‑box models while keeping the model's decision logic transparent.
Key Contributions
- Optimized EBM pipeline: Combines feature selection, scaling order, and hyper‑parameter tuning into a single, reproducible workflow.
- Taguchi‑based design of experiments: Uses the Taguchi method to efficiently explore the interaction between data scalers and model hyper‑parameters, reducing the number of trial runs while still finding near‑optimal settings.
- Bias‑free handling of class imbalance: Avoids oversampling/undersampling that can distort fraud patterns, instead relying on model‑level regularization and feature engineering.
- State‑of‑the‑art performance: Achieves ROC‑AUC = 0.983 on the canonical credit‑card fraud dataset, surpassing Logistic Regression, Random Forest, XGBoost, and Decision Tree baselines.
- Interpretability at scale: Provides clear visualizations of feature importance and pairwise interactions, enabling auditors and fraud analysts to trace why a transaction was flagged.
Methodology
Data preprocessing
- Standard scaling, min‑max scaling, and robust scaling are evaluated in different sequences.
- Missing‑value handling and outlier clipping are applied uniformly across all experiments.
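The scalers under comparison differ only in how each feature is normalized. As a minimal stdlib sketch (robust scaling, not shown, would substitute the median and interquartile range), assuming nothing from the paper's code:

```python
def standard_scale(xs):
    """Center to mean 0 and scale to unit variance (z-score)."""
    n = len(xs)
    mean = sum(xs) / n
    var = sum((x - mean) ** 2 for x in xs) / n
    std = var ** 0.5 or 1.0          # guard against zero variance
    return [(x - mean) / std for x in xs]

def min_max_scale(xs):
    """Map the observed range onto [0, 1]."""
    lo, hi = min(xs), max(xs)
    span = (hi - lo) or 1.0          # guard against a constant feature
    return [(x - lo) / span for x in xs]

amounts = [2.5, 100.0, 3.7, 55.0, 9.9]   # illustrative transaction amounts
z = standard_scale(amounts)
m = min_max_scale(amounts)
```

Because min‑max scaling is bounded by the extremes, a single very large fraudulent amount compresses the rest of the column, which is one reason the choice of scaler interacts with the model's binning.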
Feature selection
- Recursive feature elimination (RFE) and mutual‑information ranking prune the original 30 variables down to the most predictive subset, reducing noise for the EBM.
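The paper does not publish its selection code; a sketch with scikit‑learn's `RFE` and `mutual_info_classif` on synthetic data (the sample sizes, estimator, and cut‑off of 10 features are all assumptions) shows how the two rankings can be intersected:

```python
import numpy as np
from sklearn.datasets import make_classification
from sklearn.feature_selection import RFE, mutual_info_classif
from sklearn.linear_model import LogisticRegression

# Synthetic stand-in for the 30-feature fraud table (imbalanced classes).
X, y = make_classification(n_samples=2000, n_features=30, n_informative=8,
                           weights=[0.98], random_state=0)

# Recursive feature elimination down to a hypothetical subset of 10.
rfe = RFE(LogisticRegression(max_iter=1000), n_features_to_select=10).fit(X, y)

# Mutual-information ranking: keep the 10 highest-scoring features.
mi = mutual_info_classif(X, y, random_state=0)
mi_top = set(np.argsort(mi)[-10:])

# Features both methods agree on form the pruned subset fed to the EBM.
selected = sorted(set(np.flatnonzero(rfe.support_)) & mi_top)
```

Intersecting the two rankings is one plausible reading of the pipeline; the paper may instead apply them sequentially.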
Explainable Boosting Machine (EBM)
- EBM builds an additive model of shape functions for each feature plus selected pairwise interaction terms, preserving interpretability.
- Hyper‑parameters such as the number of outer/inner bags, learning rate, and maximum number of interaction terms are tuned.
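In practice this model is the `interpret` package's `ExplainableBoostingClassifier`, whose `outer_bags`, `inner_bags`, `learning_rate`, and `interactions` parameters correspond to the hyper‑parameters tuned above. To keep the illustration self‑contained, the following is a squared‑loss toy of the core idea, cyclic per‑feature boosting of binned shape functions, and not the paper's implementation:

```python
def fit_toy_ebm(X, y, n_bins=8, rounds=30, lr=0.1):
    """Toy additive model: one binned shape function per feature,
    fit by cyclic (round-robin) boosting on squared-loss residuals."""
    n, d = len(X), len(X[0])
    # Equal-width bin edges per feature over the observed range.
    edges = []
    for j in range(d):
        col = [row[j] for row in X]
        lo, hi = min(col), max(col)
        edges.append([lo + (hi - lo) * k / n_bins for k in range(1, n_bins)])
    bin_of = lambda x, ed: sum(x > e for e in ed)
    bins = [[bin_of(X[i][j], edges[j]) for j in range(d)] for i in range(n)]
    shapes = [[0.0] * n_bins for _ in range(d)]
    pred = [0.0] * n
    for _ in range(rounds):
        for j in range(d):            # visit each feature in turn
            sums = [0.0] * n_bins
            cnts = [0] * n_bins
            for i in range(n):        # accumulate residuals per bin
                b = bins[i][j]
                sums[b] += y[i] - pred[i]
                cnts[b] += 1
            for b in range(n_bins):   # small step toward the bin-mean residual
                if cnts[b]:
                    shapes[j][b] += lr * sums[b] / cnts[b]
            for i in range(n):        # refresh predictions with updated shapes
                pred[i] = sum(shapes[k][bins[i][k]] for k in range(d))
    return shapes, pred

# Tiny synthetic regression target: "risk" rises with both features.
X = [[i / 50, (i % 7) / 7] for i in range(100)]
y = [2 * a + b for a, b in X]
shapes, pred = fit_toy_ebm(X, y)
mse = sum((yi - pi) ** 2 for yi, pi in zip(y, pred)) / len(y)
```

Because each `shapes[j]` is just a per‑bin lookup table, the final model can be plotted feature by feature, which is exactly what makes the EBM auditable.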
Taguchi optimization
- A Taguchi orthogonal array defines a compact set of experiments that spans the scaler and EBM hyper‑parameter levels in a balanced fraction of the full factorial, rather than enumerating every combination.
- Signal‑to‑noise ratios (larger‑the‑better) guide the selection of the best configuration, dramatically cutting down the search space compared with grid or random search.
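The orthogonal‑array and signal‑to‑noise analysis can be sketched with the stdlib alone. Everything below is illustrative: the L4(2³) array, the factor levels, and the per‑run AUCs are hypothetical stand‑ins, not the paper's actual design.

```python
import math

# Hypothetical L4(2^3) orthogonal array: 4 runs, 3 two-level factors,
# with each level appearing equally often in every column.
L4 = [(0, 0, 0), (0, 1, 1), (1, 0, 1), (1, 1, 0)]
factors = {                       # hypothetical factor levels
    "scaler": ["standard", "minmax"],
    "learning_rate": [0.01, 0.05],
    "interactions": [5, 10],
}
auc = [0.971, 0.978, 0.969, 0.983]   # hypothetical per-run ROC-AUCs

def sn_larger_is_better(y):
    """Taguchi larger-the-better S/N ratio: -10 * log10(mean(1/y^2))."""
    return -10.0 * math.log10(1.0 / (y * y))

sn = [sn_larger_is_better(a) for a in auc]

# Average S/N per factor level; the level with the higher mean wins.
best = {}
for col, (name, levels) in enumerate(factors.items()):
    means = [
        sum(sn[r] for r in range(len(L4)) if L4[r][col] == lvl) / 2
        for lvl in (0, 1)
    ]
    best[name] = levels[means.index(max(means))]
# best now maps each factor to its preferred level.
```

Four runs here probe a 2³ space of eight combinations; the paper's array samples its larger scaler/hyper‑parameter grid at a similarly small fraction of the full factorial.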
Evaluation
- 5‑fold cross‑validation on the publicly available “Credit Card Fraud Detection” dataset (284,807 transactions, 0.172 % fraud).
- Primary metric: ROC‑AUC; secondary metrics include precision‑recall curves and confusion‑matrix‑derived fraud‑catch rates.
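The evaluation loop above can be sketched with scikit‑learn on a synthetic imbalanced dataset; the stand‑in classifier, sample sizes, and thresholds are assumptions, not the paper's setup:

```python
import numpy as np
from sklearn.datasets import make_classification
from sklearn.linear_model import LogisticRegression
from sklearn.metrics import roc_auc_score, roc_curve
from sklearn.model_selection import StratifiedKFold, cross_val_predict

# Synthetic stand-in mimicking the dataset's extreme class imbalance.
X, y = make_classification(n_samples=20000, n_features=30, n_informative=8,
                           weights=[0.998], random_state=0)

# Out-of-fold fraud probabilities from stratified 5-fold cross-validation.
cv = StratifiedKFold(n_splits=5, shuffle=True, random_state=0)
proba = cross_val_predict(LogisticRegression(max_iter=1000), X, y,
                          cv=cv, method="predict_proba")[:, 1]

auc = roc_auc_score(y, proba)                      # primary metric
fpr, tpr, _ = roc_curve(y, proba)
recall_at_1pct = tpr[np.searchsorted(fpr, 0.01)]   # fraud-catch rate at 1% FPR
```

Stratification matters here: with only ~0.17 % positives, an unstratified split can leave a fold with almost no fraud cases, making fold‑level metrics meaningless.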
Results & Findings
| Model | ROC‑AUC | Precision @ 0.01 FPR |
|---|---|---|
| Optimized EBM (this work) | 0.983 | 0.71 |
| Baseline EBM (original) | 0.975 | 0.66 |
| Logistic Regression | 0.945 | 0.48 |
| Random Forest | 0.962 | 0.58 |
| XGBoost | 0.970 | 0.62 |
| Decision Tree | 0.912 | 0.35 |
- The Taguchi‑driven scaler‑hyper‑parameter combo contributed ~0.006 AUC gain over the baseline EBM.
- Interaction plots reveal that transaction amount combined with time‑since‑last‑transaction is the strongest predictor of fraud, a pattern that aligns with domain expertise.
- Model training time remains modest (≈ 2 minutes on a single CPU core), making the approach feasible for near‑real‑time scoring pipelines.
Practical Implications
- Deployable in production: The optimized EBM runs efficiently on CPU, eliminating the need for GPU clusters often required by deep‑learning fraud detectors.
- Regulatory compliance: Because the model’s decision logic is human‑readable, financial institutions can satisfy audit trails and explainability mandates (e.g., GDPR, Basel III).
- Reduced false positives: Higher precision at low false‑positive rates translates to fewer legitimate transactions being blocked, improving customer experience.
- Feature‑driven risk controls: The interaction insights can be fed back into rule‑based systems or used to prioritize manual review queues.
- Reusable workflow: The Taguchi optimization framework can be adapted to other imbalanced classification problems (e.g., insurance claim fraud, intrusion detection).
Limitations & Future Work
- Dataset scope: Experiments are limited to a single, publicly available dataset; real‑world data may exhibit different temporal drift or feature distributions.
- Scalability of interactions: EBM currently supports only a limited number of pairwise interactions; extending to higher‑order terms could capture more complex fraud patterns but at a cost to interpretability.
- Online learning: The current pipeline is batch‑oriented; integrating incremental updates for streaming transaction data remains an open challenge.
- Broader hyper‑parameter search: While Taguchi reduces experiments, it may miss non‑linear effects that more exhaustive Bayesian optimization could uncover.
Overall, the study demonstrates that a carefully tuned, interpretable model can rival—or even surpass—state‑of‑the‑art black‑box classifiers in fraud detection, offering a compelling path forward for trustworthy AI in finance.
Authors
- Reza E. Fazel
- Arash Bakhtiary
- Siavash A. Bigdeli
Paper Information
- arXiv ID: 2602.06955v1
- Categories: cs.LG
- Published: February 6, 2026