[Paper] Generalization from Low- to Moderate-Resolution Spectra with Neural Networks for Stellar Parameter Estimation: A Case Study with DESI

Published: February 16, 2026 at 01:58 PM EST

Source: arXiv - 2602.15021v1

Overview

The paper tackles a practical problem that many data‑science teams face: how to reuse a model trained on one set of low‑resolution stellar spectra when a new, higher‑resolution survey becomes available. Using the Large Sky Area Multi‑Object Fiber Spectroscopic Telescope (LAMOST) low‑resolution data and the Dark Energy Spectroscopic Instrument (DESI) medium‑resolution data as a testbed, the authors show that even very simple neural networks—plain multilayer perceptrons (MLPs)—can be pre‑trained on the cheap, abundant low‑resolution spectra and then adapted to deliver accurate stellar parameters on the newer, richer DESI data.


Key Contributions

  • Cross‑survey transfer with minimal architecture – Demonstrates that a vanilla MLP, pre‑trained on LAMOST low‑resolution spectra, already yields strong parameter estimates on DESI medium‑resolution spectra without any fine‑tuning.
  • Embedding vs. raw‑spectrum training – Compares MLPs trained directly on spectra to MLPs that ingest embeddings from a transformer‑based “spectral foundation model”.
  • Systematic fine‑tuning study – Evaluates three adaptation strategies (residual‑head adapters, LoRA low‑rank adaptation, and full weight fine‑tuning) and shows that the best choice varies by target parameter (e.g., effective temperature vs. iron abundance).
  • Metallicity‑dependent performance insight – Finds that transformer embeddings help in the metal‑rich regime ([Fe/H] > ‑1.0) but lag behind raw‑spectrum MLPs for metal‑poor stars.
  • Practical recipe for astronomers and ML engineers – Provides a clear workflow: pre‑train a lightweight MLP on abundant low‑resolution data, optionally swap in transformer embeddings, then apply a modest fine‑tuning step tailored to the parameter of interest.

Methodology

  1. Data preparation

    • Source domain: ~2 M LAMOST low‑resolution (R ≈ 1800) stellar spectra.
    • Target domain: ~100 k DESI medium‑resolution (R ≈ 5000) spectra covering the same stellar types.
    • Ground‑truth labels (effective temperature Tₑff, surface gravity log g, metallicity [Fe/H]) are taken from high‑quality reference catalogs.
  2. Model families

    • Plain MLP: 3–4 hidden layers, ReLU activations, trained directly on the flux vector (or on a reduced set of principal components).
    • Embedding‑based MLP: First run each spectrum through a pre‑trained transformer (self‑supervised on millions of spectra) to obtain a fixed‑size embedding; the MLP then maps embeddings → stellar parameters.
  3. Pre‑training

    • Both MLP variants are trained on the LAMOST data until convergence (early stopping on a validation split).
  4. Fine‑tuning strategies (applied on DESI data)

    • Residual‑head adapter: Freeze the bulk of the MLP, add a small residual block on top that learns the domain shift.
    • LoRA (Low‑Rank Adaptation): Insert low‑rank matrices into each weight layer, updating only those while keeping the original weights frozen.
    • Full fine‑tuning: Unfreeze all parameters and continue training on DESI spectra.
  5. Evaluation

    • Metrics: root‑mean‑square error (RMSE) and bias for each stellar parameter, reported separately for metal‑rich and metal‑poor subsets.
    • Baselines: (i) training an MLP from scratch on DESI spectra, (ii) a state‑of‑the‑art spectral fitting pipeline.
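The plain MLP from step 2 can be sketched as a NumPy forward pass. The layer widths, initialization, and input size below are illustrative assumptions; the paper specifies only 3–4 hidden layers with ReLU activations mapping a flux vector to the three stellar parameters:

```python
import numpy as np

rng = np.random.default_rng(0)

def relu(x):
    return np.maximum(x, 0.0)

class PlainMLP:
    """Toy 3-hidden-layer MLP: flux vector -> (Teff, log g, [Fe/H]).

    Widths are hypothetical; the paper states only 3-4 hidden layers
    with ReLU activations.
    """
    def __init__(self, n_pixels, hidden=(256, 128, 64), n_labels=3):
        sizes = [n_pixels, *hidden, n_labels]
        # He-style initialization, a common default for ReLU networks.
        self.weights = [rng.normal(0.0, np.sqrt(2.0 / fi), (fi, fo))
                        for fi, fo in zip(sizes[:-1], sizes[1:])]
        self.biases = [np.zeros(fo) for fo in sizes[1:]]

    def forward(self, flux):
        h = flux
        for W, b in zip(self.weights[:-1], self.biases[:-1]):
            h = relu(h @ W + b)
        # Linear output head: regression targets, no final activation.
        return h @ self.weights[-1] + self.biases[-1]

mlp = PlainMLP(n_pixels=3500)            # e.g. a resampled spectrum
params = mlp.forward(rng.normal(size=3500))
print(params.shape)                      # one value per stellar parameter
```

The embedding-based variant would be identical except that `n_pixels` is replaced by the fixed size of the transformer embedding.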
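The LoRA strategy from step 4 can be illustrated for a single frozen layer. The dimensions and rank here are hypothetical (the paper does not report them); the point is that only the two small factor matrices are trained while the pre-trained weight stays frozen:

```python
import numpy as np

rng = np.random.default_rng(1)

d_in, d_out, rank = 128, 64, 4

W_frozen = rng.normal(size=(d_in, d_out))    # pre-trained weight, never updated

# LoRA factors: only these low-rank matrices receive gradient updates.
A = rng.normal(0.0, 0.01, size=(d_in, rank))
B = np.zeros((rank, d_out))                  # zero init: adaptation starts as a no-op

def lora_forward(x):
    # Effective weight is W_frozen + A @ B, applied without materializing it.
    return x @ W_frozen + (x @ A) @ B

x = rng.normal(size=d_in)
# With B = 0 the adapted layer reproduces the frozen layer exactly.
assert np.allclose(lora_forward(x), x @ W_frozen)
print(f"trainable: {A.size + B.size} params vs full fine-tune: {W_frozen.size}")
```

This parameter count is why LoRA is attractive when only a few thousand labeled DESI spectra are available for adaptation.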
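The evaluation metrics in step 5, including the metal-rich/metal-poor split at [Fe/H] = −1.0 dex, reduce to a few lines. The label values below are invented purely for illustration:

```python
import numpy as np

def rmse(pred, truth):
    """Root-mean-square error."""
    return float(np.sqrt(np.mean((pred - truth) ** 2)))

def bias(pred, truth):
    """Mean signed offset of predictions from truth."""
    return float(np.mean(pred - truth))

# Toy [Fe/H] labels; the -1.0 dex threshold follows the paper's
# metal-rich / metal-poor division.
truth = np.array([-2.1, -1.5, -0.8, -0.3, 0.1])
pred  = np.array([-1.9, -1.6, -0.7, -0.3, 0.2])

rich = truth > -1.0
for name, mask in [("metal-rich", rich), ("metal-poor", ~rich)]:
    print(name, rmse(pred[mask], truth[mask]), bias(pred[mask], truth[mask]))
```

Reporting the two subsets separately is what exposes the metallicity-dependent behavior of the transformer embeddings.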

Results & Findings

| Model / Strategy | Tₑff RMSE (K) | log g RMSE (dex) | [Fe/H] RMSE (dex) |
| --- | --- | --- | --- |
| Pre‑trained raw‑spectrum MLP (no fine‑tune) | 115 | 0.22 | 0.18 |
| Pre‑trained raw‑spectrum MLP + residual‑head | 98 | 0.19 | 0.15 |
| Pre‑trained raw‑spectrum MLP + LoRA | 102 | 0.20 | 0.16 |
| Pre‑trained raw‑spectrum MLP + full fine‑tune | 95 | 0.18 | 0.14 |
| Embedding‑based MLP (no fine‑tune) | 130 | 0.25 | 0.20 |
| Embedding‑based MLP + full fine‑tune | 110 | 0.22 | 0.13 (metal‑rich) |
| DESI‑only trained MLP (baseline) | 120 | 0.24 | 0.19 |

Take‑aways

  • Zero‑shot transfer works surprisingly well – the raw‑spectrum MLP already beats the DESI‑only baseline despite never seeing DESI data.
  • Fine‑tuning yields modest gains, especially for surface gravity and temperature; a lightweight residual‑head adapter recovers most of the improvement of full retraining while updating far fewer parameters.
  • Transformer embeddings shine only for metal‑rich stars, where they reduce [Fe/H] error to 0.13 dex after full fine‑tuning. In the metal‑poor regime, the raw‑spectrum MLP remains superior.
  • Optimal adaptation is parameter‑dependent – e.g., LoRA is competitive for metallicity but not for temperature.

Practical Implications

  1. Rapid onboarding of new surveys – Teams can bootstrap a stellar‑parameter pipeline for a brand‑new spectroscopic instrument by re‑using an MLP trained on legacy low‑resolution data, cutting down on the need for large labeled DESI‑specific training sets.
  2. Cost‑effective model maintenance – Because the MLP is tiny (a few MB) and fine‑tuning can be done with a few thousand labeled spectra, computational resources stay modest—ideal for cloud‑based pipelines or on‑site processing at observatories.
  3. Hybrid approach for metallicity studies – Projects focusing on metal‑rich populations (e.g., Galactic disk chemistry) may benefit from transformer embeddings, while metal‑poor halo investigations should stick with raw‑spectrum MLPs.
  4. Foundation‑model caution – The work suggests that large self‑supervised models are not a silver bullet for cross‑survey transfer; developers should evaluate them against simpler baselines before committing to heavy infrastructure.
  5. Transfer‑learning recipe – The paper essentially provides a “starter kit”: pre‑train a shallow MLP on any abundant low‑resolution catalog, then apply a residual‑head adapter on the new survey. This pattern can be generalized to other domains (e.g., medical imaging, remote sensing) where resolution or sensor changes occur.
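The residual-head part of this recipe can be sketched as follows. The 16-unit adapter width and the stand-in for the frozen pre-trained network are assumptions (the paper says only "a small residual block"); the key property is that the frozen model's predictions pass through unchanged until the adapter is trained:

```python
import numpy as np

rng = np.random.default_rng(2)

n_labels = 3

def frozen_predict(flux):
    # Stand-in for the frozen pre-trained MLP's output (Teff, log g, [Fe/H]).
    return np.array([5500.0, 4.4, -0.5])

# Small trainable residual head: learns a correction for the domain shift.
W1 = rng.normal(0.0, 0.01, size=(n_labels, 16))
W2 = np.zeros((16, n_labels))   # zero init: adapter starts as an identity map

def adapted_predict(flux):
    base = frozen_predict(flux)
    correction = np.maximum(base @ W1, 0.0) @ W2   # ReLU residual block
    return base + correction

flux = rng.normal(size=3500)
# Before any fine-tuning, the adapter leaves the predictions untouched.
assert np.allclose(adapted_predict(flux), frozen_predict(flux))
```

Only `W1` and `W2` would be updated on the new survey's labeled spectra, which is what keeps the adaptation cheap.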

Limitations & Future Work

  • Domain shift beyond resolution – The study only varies spectral resolution; other shifts (e.g., wavelength coverage, signal‑to‑noise, instrument response) remain untested.
  • Transformer pre‑training scope – The foundation model was trained on a specific set of stellar spectra; its generality to other stellar types (e.g., very cool dwarfs, hot O‑stars) is unclear.
  • Label quality dependence – Fine‑tuning performance hinges on the accuracy of the DESI reference labels; systematic errors in those catalogs could bias conclusions.
  • Scalability to ultra‑high‑resolution data – It is not yet known whether the same lightweight MLP approach scales to R > 20 000 spectra where line blending becomes more complex.
  • Future directions – The authors propose (i) exploring domain‑adaptation techniques that explicitly model resolution differences, (ii) expanding the transformer’s pre‑training to include synthetic spectra covering a broader parameter space, and (iii) testing the pipeline on time‑domain spectroscopic surveys where spectra evolve (e.g., supernovae, variable stars).

Authors

  • Xiaosheng Zhao
  • Yuan-Sen Ting
  • Rosemary F. G. Wyse
  • Alexander S. Szalay
  • Yang Huang
  • László Dobos
  • Tamás Budavári
  • Viska Wei

Paper Information

  • arXiv ID: 2602.15021v1
  • Categories: astro-ph.SR, astro-ph.GA, cs.LG
  • Published: February 16, 2026
