[Paper] FairRF: Multi-Objective Search for Single and Intersectional Software Fairness
Source: arXiv - 2601.07537v1
Overview
The paper introduces FairRF, a multi‑objective evolutionary search technique that simultaneously tunes a Random Forest classifier’s hyper‑parameters and mutates its training data to improve both fairness (reducing bias) and effectiveness (prediction accuracy). By returning a Pareto front of trade‑off solutions, FairRF lets product owners, data scientists, and engineers pick the model that best matches their fairness‑vs‑performance priorities.
Key Contributions
- Multi‑objective evolutionary search for fairness – combines fairness and effectiveness as first‑class optimization goals rather than applying a single post‑hoc bias‑mitigation step.
- Hyper‑parameter + data mutation search – simultaneously explores Random Forest settings (e.g., number of trees, max depth) and systematic data transformations (re‑sampling, label flipping) that can reduce bias.
- Pareto‑optimal solution set – delivers a portfolio of models, each representing a different fairness‑effectiveness trade‑off, enabling stakeholder‑driven selection.
- Comprehensive empirical evaluation – compared against 26 baselines (including state‑of‑the‑art bias‑mitigation methods) across 11 classification scenarios, using five effectiveness metrics and three fairness metrics together with their intersectional variants (six fairness definitions in total).
- Superior performance on intersectional bias – outperforms the previous best method at mitigating bias that affects overlapping protected groups (e.g., race + gender); the sketch after this list illustrates what such an intersectional metric measures.
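The paper itself does not ship a code listing, but as a rough illustration of what an intersectional fairness metric computes, the sketch below measures statistical parity difference over the groups defined by one protected attribute, or by a combination of attributes such as race × gender. The max‑minus‑min formulation, the pandas layout, and all column names are illustrative assumptions, not the authors' implementation.

```python
# Illustrative sketch (not the authors' code): statistical parity difference
# generalized to several groups as max minus min positive-prediction rate.
import pandas as pd

def statistical_parity_difference(df: pd.DataFrame, group_cols, pred_col="pred"):
    """Largest gap in positive-prediction rates across the groups defined by
    `group_cols`: one column gives single-attribute fairness, several columns
    give intersectional fairness (e.g., every race x gender subgroup)."""
    rates = df.groupby(list(group_cols))[pred_col].mean()
    return rates.max() - rates.min()

# Hypothetical usage on a dataframe of binary predictions:
# spd_gender       = statistical_parity_difference(df, ["gender"])
# spd_intersection = statistical_parity_difference(df, ["race", "gender"])
```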
Methodology
- Base learner – Random Forest (RF) is chosen for its popularity and flexibility.
- Search space (see the candidate‑encoding sketch after this list)
  - RF hyper‑parameters: number of trees, max depth, min samples split, etc.
  - Data mutation operators: oversampling/undersampling of minority groups, label smoothing, synthetic example generation.
- Evolutionary algorithm – a multi‑objective genetic algorithm (e.g., NSGA‑II) evolves a population of candidate configurations (see the pymoo sketch after this list). Each candidate is evaluated on:
  - Effectiveness – accuracy, F1‑score, AUC, etc.
  - Fairness – statistical parity difference, equalized odds, and their intersectional extensions.
- Pareto front extraction – after a fixed number of generations, the non‑dominated solutions (those for which no other candidate improves one objective without worsening the other) are returned.
- Benchmarking – the authors run FairRF and 26 baseline methods (pre‑processing, in‑processing, post‑processing techniques) on publicly available datasets (e.g., Adult, COMPAS) and report average ranks across all metrics.
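As referenced in the “Search space” bullet above, the sketch below shows one plausible candidate encoding: a bundle of RF hyper‑parameters plus a data‑mutation setting, with oversampling of a protected minority group as an example operator. The specific genes, the dataclass layout, and the column names are assumptions for illustration, not the paper's exact genome.

```python
# Illustrative candidate encoding (assumed, not the paper's exact genome):
# each individual pairs RF hyper-parameters with a data-mutation choice.
from dataclasses import dataclass
import pandas as pd
from sklearn.ensemble import RandomForestClassifier
from sklearn.utils import resample

@dataclass
class Candidate:
    n_estimators: int        # number of trees
    max_depth: int           # maximum tree depth
    min_samples_split: int   # minimum samples required to split a node
    oversample_ratio: float  # how aggressively to oversample the minority group

def apply_mutation(train: pd.DataFrame, cand: Candidate,
                   protected: str = "gender", minority=0) -> pd.DataFrame:
    """Oversample rows of the minority protected group (one possible mutation
    operator; the paper also lists undersampling and label edits)."""
    minority_rows = train[train[protected] == minority]
    n_extra = int(len(minority_rows) * cand.oversample_ratio)
    if n_extra == 0:
        return train
    extra = resample(minority_rows, n_samples=n_extra, replace=True, random_state=0)
    return pd.concat([train, extra], ignore_index=True)

def build_model(cand: Candidate) -> RandomForestClassifier:
    return RandomForestClassifier(
        n_estimators=cand.n_estimators,
        max_depth=cand.max_depth,
        min_samples_split=cand.min_samples_split,
        random_state=0,
    )
```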
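Likewise, as referenced in the “Evolutionary algorithm” bullet, the two‑objective search could be driven by an off‑the‑shelf NSGA‑II. The pymoo‑based sketch below minimizes (1 − accuracy) and statistical parity difference, reusing Candidate, apply_mutation, build_model, and statistical_parity_difference from the earlier sketches; the choice of pymoo, the genome bounds, and the train/validation interface are tooling assumptions, not the authors' setup.

```python
# Illustrative NSGA-II wiring with pymoo (a tooling assumption, not the
# paper's exact setup); reuses the helpers defined in the sketches above.
import numpy as np
from sklearn.metrics import accuracy_score
from pymoo.core.problem import ElementwiseProblem
from pymoo.algorithms.moo.nsga2 import NSGA2
from pymoo.optimize import minimize

class FairnessProblem(ElementwiseProblem):
    def __init__(self, train, valid, features, label, protected):
        # Genome: [n_estimators, max_depth, min_samples_split, oversample_ratio]
        super().__init__(n_var=4, n_obj=2,
                         xl=np.array([10, 2, 2, 0.0]),
                         xu=np.array([300, 30, 20, 1.0]))
        self.train, self.valid = train, valid
        self.features, self.label, self.protected = features, label, protected

    def _evaluate(self, x, out, *args, **kwargs):
        cand = Candidate(int(x[0]), int(x[1]), int(x[2]), float(x[3]))
        mutated = apply_mutation(self.train, cand, protected=self.protected)
        model = build_model(cand).fit(mutated[self.features], mutated[self.label])
        preds = model.predict(self.valid[self.features])
        acc = accuracy_score(self.valid[self.label], preds)
        groups = self.valid[[self.protected]].copy()
        groups["pred"] = preds
        spd = statistical_parity_difference(groups, [self.protected])
        out["F"] = [1.0 - acc, spd]  # both objectives are minimized

# Hypothetical run: res.X holds the non-dominated configurations and res.F
# their (error, disparity) scores, i.e. the Pareto front returned to users.
# res = minimize(FairnessProblem(train, valid, features, "income", "gender"),
#                NSGA2(pop_size=40), ("n_gen", 30), seed=1, verbose=False)
```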
Results & Findings
| Aspect | FairRF vs. Baselines |
|---|---|
| Fairness improvement | Up to a 30% reduction in disparity compared with the raw RF, and a 12% improvement over the previous best intersectional method. |
| Effectiveness retention | Prediction accuracy stays within 1–2% of the best accuracy‑only baseline, indicating a minimal trade‑off cost. |
| Stability across definitions | FairRF consistently yields better or comparable fairness scores across all six fairness definitions, whereas many baselines excel only on a single metric. |
| Pareto diversity | The generated front contains 8–12 distinct models per run, giving developers concrete options rather than a single “one‑size‑fits‑all” model. |
In short, FairRF not only makes models fairer but does so without sacrificing the core predictive power that production systems rely on.
Practical Implications
- Developer‑friendly fairness tuning – Instead of manually fiddling with re‑sampling or adding fairness constraints, teams can plug FairRF into their CI pipeline and let the evolutionary search surface a set of ready‑to‑deploy models.
- Stakeholder negotiation – Product managers can visualize the fairness‑vs‑accuracy trade‑off curve and make data‑driven decisions (e.g., “we accept a 0.5% drop in accuracy for a 15% drop in gender bias”); the sketch after this list shows one way to encode such a rule.
- Intersectional compliance – Regulations increasingly require proof that systems treat combined protected attributes fairly. FairRF’s built‑in support for intersectional metrics helps teams prepare for GDPR, EEOC, or sector‑specific fairness audits.
- Extensible to other learners – While the paper focuses on Random Forests, the same evolutionary framework can wrap other tree‑based ensembles (XGBoost, LightGBM) or even neural nets, making it a versatile addition to any ML toolbox.
- Reduced engineering overhead – By automating both hyper‑parameter tuning and bias mitigation, teams save time compared to running separate fairness‑specific pipelines.
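As noted in the “Stakeholder negotiation” bullet above, a trade‑off rule of that kind can be applied mechanically to the returned front. The sketch below, assuming the (error, disparity) objective matrix res.F from the earlier pymoo sketch, keeps only candidates within a tolerated accuracy drop and picks the least‑biased one; the 0.5% threshold is a placeholder.

```python
# Illustrative selection policy over a Pareto front: F[:, 0] is 1 - accuracy,
# F[:, 1] is disparity, matching the objective order assumed above.
import numpy as np

def pick_model(F: np.ndarray, max_accuracy_drop: float = 0.005) -> int:
    """Among candidates whose accuracy is within `max_accuracy_drop` of the
    most accurate one, return the index of the candidate with least disparity."""
    error, disparity = F[:, 0], F[:, 1]
    tolerated = np.where(error <= error.min() + max_accuracy_drop)[0]
    return tolerated[np.argmin(disparity[tolerated])]

# chosen = pick_model(res.F)    # e.g., accept at most a 0.5% accuracy drop
# config = res.X[chosen]        # hyper-parameters + mutation settings to deploy
```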
Limitations & Future Work
- Computational cost – Multi‑objective evolutionary search can be expensive on large datasets; the authors note longer runtimes compared to single‑objective tuning.
- Random Forest focus – The current implementation is tied to RF; extending to deep learning models may require redesigning mutation operators.
- Metric selection – FairRF optimizes the metrics you feed it; choosing appropriate fairness definitions for a given domain remains a non‑trivial, domain‑expert task.
- Scalability to streaming data – The approach assumes a static training set; future work could explore incremental or online versions for real‑time systems.
FairRF demonstrates that fairness need not be a bolted‑on afterthought. By treating fairness as a first‑class optimization objective and delivering a portfolio of trade‑off models, it gives developers the practical levers they need to build responsible AI systems at scale.
Authors
- Giordano d’Aloisio
- Max Hort
- Rebecca Moussa
- Federica Sarro
Paper Information
- arXiv ID: 2601.07537v1
- Categories: cs.SE
- Published: January 12, 2026