[Paper] Discount Model Search for Quality Diversity Optimization in High-Dimensional Measure Spaces

Published: January 3, 2026 at 01:05 AM EST
4 min read
Source: arXiv - 2601.01082v1

Overview

The paper introduces Discount Model Search (DMS), a new algorithm for Quality‑Diversity (QD) optimization that works reliably even when the measure space – the set of traits used to define “diversity” – is high‑dimensional. By replacing the coarse histogram used in prior methods with a smooth, learned model of “discount values,” DMS can keep exploring distinct solutions where older approaches would stall. The authors demonstrate DMS on image‑based domains, showing that developers can now define diversity through example datasets instead of hand‑crafted metrics.

Key Contributions

  • Discount Model Search (DMS): A QD algorithm that models discount values continuously, avoiding the cell‑collision problem of histogram‑based methods.
  • Scalable to high‑dimensional measures: Works with dozens or hundreds of measure dimensions (e.g., raw image pixels) where previous QD algorithms fail.
  • Image‑driven QD applications: Introduces two novel domains where the measure is an entire image, letting users specify desired traits simply by providing example images.
  • Empirical superiority: Across standard high‑dimensional benchmarks and the new image domains, DMS consistently outperforms CMA‑MAE and other black‑box QD baselines in both quality and diversity.

Methodology

  1. Background – Discount Values: In QD, each explored region of the measure space accumulates a discount that lowers the reward for landing there again, steering the search toward unexplored regions. CMA‑MAE stores these discounts as per‑cell values in a discrete histogram over the measure space.
  2. Problem with Histograms: In high‑dimensional spaces, many distinct solutions map to the same histogram cell, receiving identical discounts. This “measure distortion” causes the search to stagnate.
  3. Learning a Discount Model: DMS replaces the histogram with a regression model (e.g., a neural network) that predicts a continuous discount value for any point in the measure space. The model is trained online using the discounts observed from evaluated solutions (a minimal sketch of the full loop appears after this list).
  4. Guiding the Search:
    • Selection: Candidates are sampled from a CMA‑ES (Covariance Matrix Adaptation Evolution Strategy) distribution, as in CMA‑MAE.
    • Evaluation: Each candidate’s objective score and measure vector are computed.
    • Discount Assignment: The learned model provides a smooth discount for the candidate’s measure, which is used to update the archive (the collection of elite solutions).
    • Model Update: After each generation, the model is retrained on the latest (measure, discount) pairs, gradually refining its representation of the discount landscape.
  5. High‑Dimensional Measure Handling: Because the model can interpolate between nearby points, two solutions with similar but not identical measures receive distinct discounts, preserving the pressure to explore new regions.
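To make the mechanism concrete, here is a minimal NumPy sketch of one DMS generation. It is an illustration under assumptions, not the paper's implementation: `es` stands in for any CMA‑ES object with an ask/tell interface, `evaluate` for the domain's objective‑and‑measure function, and the random‑feature regressor for the paper's neural network.

```python
import numpy as np

class DiscountModel:
    """Tiny regressor over random Fourier features -- a stand-in for the
    neural network DMS uses to predict a smooth discount for any measure."""

    def __init__(self, measure_dim, n_features=256, lr=1e-2, seed=0):
        rng = np.random.default_rng(seed)
        self.W = rng.normal(size=(measure_dim, n_features))
        self.b = rng.uniform(0.0, 2.0 * np.pi, size=n_features)
        self.w = np.zeros(n_features)  # linear head, trained online
        self.lr = lr

    def _features(self, measures):
        return np.cos(measures @ self.W + self.b)

    def predict(self, measures):
        return self._features(measures) @ self.w

    def update(self, measures, targets, steps=100):
        # Refit on the latest (measure, discount) pairs by gradient descent.
        phi = self._features(measures)
        for _ in range(steps):
            residual = phi @ self.w - targets
            self.w -= self.lr * phi.T @ residual / len(targets)


def dms_generation(es, model, evaluate, alpha=0.1):
    """One generation of the DMS loop sketched in the steps above."""
    candidates = es.ask()                      # sample from CMA-ES
    objs, measures = evaluate(candidates)      # objective + measure per candidate
    measures = np.asarray(measures)
    discounts = model.predict(measures)        # smooth, per-candidate discounts
    es.tell(candidates, -(objs - discounts))   # rank by improvement over discount
    # Soft update of discounts toward the observed objectives, in the
    # spirit of CMA-MAE's per-cell rule t <- (1 - alpha) * t + alpha * f.
    model.update(measures, (1.0 - alpha) * discounts + alpha * objs)
```

Because the regressor interpolates between observed points, two candidates with nearby but distinct measures receive distinct discounts, which is exactly the property step 5 relies on.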

Results & Findings

| Benchmark / Domain | Metric (higher is better) | DMS vs. CMA‑MAE | Notable Observation |
| --- | --- | --- | --- |
| Image‑based QD (MNIST‑style) | Coverage (unique image clusters) | +28 % | DMS discovers diverse digit styles that CMA‑MAE collapses into a few clusters. |
| High‑dimensional synthetic functions (10–50 dims) | Best objective value | +15 % | DMS maintains exploration longer, avoiding early convergence. |
| Standard QD (low‑dim measures) | Same as CMA‑MAE | ≈ equal | No regression in performance when dimensionality is low. |

Overall, DMS delivers significant gains in both diversity and quality when the measure space exceeds ~5 dimensions, while matching state‑of‑the‑art performance in low‑dimensional settings.

Practical Implications

  • Design‑by‑Example: Developers can now define “what makes a solution interesting” by feeding a set of example images (or any high‑dimensional descriptor) instead of engineering a custom measure function. This lowers the barrier for applying QD to graphics, robotics perception, or UI layout generation; a minimal sketch of such a measure function follows this list.
  • Robust Exploration in Complex Spaces: Systems that need to explore large latent spaces—e.g., neural network weight‑space visualizations, procedural content generation with high‑dimensional style vectors—can benefit from DMS’s smooth discount model to avoid premature stagnation.
  • Plug‑and‑Play with Existing Optimizers: DMS builds on CMA‑ES, a well‑known optimizer in many libraries. Integrating DMS into existing pipelines (e.g., OpenAI Gym, Unity ML‑Agents) requires only the addition of the discount model component.
  • Potential for Real‑Time Applications: Because the discount model can be updated incrementally and inference is cheap, DMS could be used in interactive design tools where users iteratively steer diversity through example uploads.
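As an illustration of the design‑by‑example idea, the sketch below builds a measure function for an image domain: the measure vector is simply the rendered candidate's pixels, or optionally an embedding of them. Here `render` and `encoder` are hypothetical placeholders for a domain‑specific renderer and a pretrained feature extractor; the paper's own domains may wire this differently.

```python
import numpy as np

def make_measure_fn(render, encoder=None):
    """Return a measure function for image-based QD (illustrative sketch).

    render  -- hypothetical callable: candidate parameters -> 2D image array.
    encoder -- optional hypothetical callable: image -> feature vector.
               If omitted, raw pixels serve as the high-dimensional measure.
    """
    def measure_fn(candidate):
        image = render(candidate)
        if encoder is not None:
            return np.asarray(encoder(image))  # learned-feature measure
        return np.asarray(image).ravel()       # raw-pixel measure
    return measure_fn
```

Plugged into the generation loop sketched earlier, `evaluate` would return each candidate's objective alongside `measure_fn(candidate)`, so users steer diversity by swapping the examples or the encoder rather than by hand‑tuning a descriptor.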

Limitations & Future Work

  • Model Choice Sensitivity: The paper uses a simple feed‑forward network; more complex measures (e.g., structured images) might need richer architectures, raising questions about training stability.
  • Scalability of Model Training: While inference is fast, retraining the discount model each generation adds overhead, especially for very large archives.
  • Theoretical Guarantees: The authors note that formal convergence guarantees for the continuous discount model remain an open problem.
  • Broader Benchmarks: Future work could evaluate DMS on domains with mixed discrete‑continuous measures (e.g., game level generation) and explore hybrid approaches that combine histogram and model‑based discounts.

Bottom line: Discount Model Search opens the door for Quality‑Diversity algorithms to thrive in high‑dimensional, data‑driven measure spaces, turning “design‑by‑hand” into “design‑by‑example” and giving developers a powerful new tool for diverse solution discovery.

Authors

  • Bryon Tjanaka
  • Henry Chen
  • Matthew C. Fontaine
  • Stefanos Nikolaidis

Paper Information

  • arXiv ID: 2601.01082v1
  • Categories: cs.LG, cs.NE
  • Published: January 3, 2026