[Paper] ROIDS: Robust Outlier-Aware Informed Down-Sampling
Source: arXiv - 2601.19477v1
Overview
Informed down-sampling (IDS) is known to improve performance in symbolic regression when combined with various selection strategies, especially tournament selection. However, recent work found that IDS’s gains are not consistent across all problems. Our analysis reveals that IDS performance is worse for problems containing outliers. IDS systematically favors including outliers in subsets, which pushes GP towards finding solutions that overfit to outliers.
To address this, we introduce ROIDS (Robust Outlier-Aware Informed Down-Sampling), which excludes potential outliers from the sampling process of IDS. With ROIDS it is possible to keep the advantages of IDS without overfitting to outliers and to compete on a wide range of benchmark problems. This is reflected in our experiments, where ROIDS shows the desired behavior on all studied benchmark problems. ROIDS consistently outperforms IDS on synthetic problems with added outliers as well as on a wide range of complex real‑world problems, surpassing IDS on over 80 % of the real‑world benchmark problems. Moreover, compared to all studied baseline approaches, ROIDS achieves the best average rank across all tested benchmark problems. This robust behavior makes ROIDS a reliable down‑sampling method for selection in symbolic regression, especially when outliers may be included in the data set.
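The idea of keeping suspected outliers out of the sampling pool can be illustrated with a minimal sketch. It is not the authors' implementation, and several parts are assumptions: each training case is represented by the errors of a few sampled models, cases are selected by farthest-first traversal, and outliers are flagged with a modified z-score on the target values. This summary does not say how ROIDS detects outliers, so `flag_outliers`, `down_sample`, and `toy_problem` are hypothetical names chosen for illustration.

```python
import random
import statistics


def flag_outliers(targets, threshold=3.5):
    """Flag targets with a large modified z-score (median/MAD based).

    Assumption: a stand-in detector, not the criterion used in the paper.
    """
    med = statistics.median(targets)
    mad = statistics.median([abs(t - med) for t in targets])
    if mad == 0:
        return [False] * len(targets)
    return [0.6745 * abs(t - med) / mad > threshold for t in targets]


def farthest_first(case_vectors, candidates, k, rng):
    """Pick up to k candidate cases, each as far as possible from those already chosen."""
    def dist(a, b):
        return sum((x - y) ** 2 for x, y in zip(case_vectors[a], case_vectors[b])) ** 0.5

    candidates = list(candidates)
    k = min(k, len(candidates))
    if k == 0:
        return []
    chosen = [rng.choice(candidates)]
    # nearest[c] is the distance from case c to its closest already-chosen case.
    nearest = {c: dist(c, chosen[0]) for c in candidates}
    while len(chosen) < k:
        nxt = max((c for c in candidates if c not in chosen), key=nearest.get)
        chosen.append(nxt)
        for c in candidates:
            nearest[c] = min(nearest[c], dist(c, nxt))
    return chosen


def down_sample(case_vectors, k, rng, excluded=None):
    """Informed down-sampling; cases marked in `excluded` never enter the pool."""
    excluded = excluded or [False] * len(case_vectors)
    candidates = [i for i, skip in enumerate(excluded) if not skip]
    return farthest_first(case_vectors, candidates, k, rng)


def toy_problem():
    """y = 2x on 20 points with two injected outliers (indices 5 and 12)."""
    xs = list(range(20))
    targets = [2.0 * x for x in xs]
    targets[5] += 100.0
    targets[12] += 100.0
    # Error of three hypothetical candidate models f(x) = a * x on every case.
    slopes = (1.5, 2.0, 2.5)
    case_vectors = [[abs(a * x - y) for a in slopes] for x, y in zip(xs, targets)]
    return targets, case_vectors


targets, case_vectors = toy_problem()
rng = random.Random(0)
plain = down_sample(case_vectors, 4, rng)
robust = down_sample(case_vectors, 4, rng, excluded=flag_outliers(targets))
print("plain sample:", sorted(plain))
print("outlier-aware sample:", sorted(robust))
```

On this toy data the error vectors of the two injected outliers lie far from those of every other case, so a distance-maximizing selection picks at least one of them whenever it draws two or more cases. Filtering the pool first leaves the selection rule unchanged while keeping those cases out of the subset.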
Key Contributions
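- An analysis showing that IDS performs worse on problems containing outliers, because it systematically favors including outliers in the down-sampled subsets.
- ROIDS, a variant of IDS that excludes potential outliers from the sampling process.
- Experiments on synthetic problems with added outliers and on real-world benchmark problems, in which ROIDS surpasses IDS on over 80% of the real-world problems and achieves the best average rank among all studied baselines.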
Methodology
Please refer to the full paper for detailed methodology.
Practical Implications
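ROIDS offers a reliable down-sampling method for selection in symbolic regression, particularly when the data set may contain outliers: it keeps the advantages of IDS while avoiding overfitting to outliers.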
Authors
- Alina Geiger
- Martin Briesch
- Dominik Sobania
- Franz Rothlauf
Paper Information
- arXiv ID: 2601.19477v1
- Categories: cs.NE
- Published: January 27, 2026