[Paper] SELDON: Supernova Explosions Learned by Deep ODE Networks

Published: March 4, 2026 at 01:57 PM EST
5 min read
Source: arXiv


Overview

The paper introduces SELDON, a deep learning framework that can model and forecast irregular, noisy astrophysical light curves—such as those produced by supernova explosions—in real time. By marrying a variational auto‑encoder with neural ordinary differential equations (ODEs), SELDON can ingest sparse, heteroscedastic observations and generate physically interpretable predictions in milliseconds, a speedup of several orders of magnitude over traditional MCMC‑based pipelines.

Key Contributions

  • Continuous‑time VAE for gappy light curves – a novel architecture that works directly with irregularly sampled, multivariate time series.
  • Masked GRU‑ODE encoder – learns a compact hidden representation from panels of highly imbalanced observations while respecting the causal ordering of data.
  • Latent neural ODE propagator – integrates the hidden state forward in continuous time, enabling accurate extrapolation to unseen epochs.
  • Interpretable Gaussian‑basis decoder – maps latent trajectories to a weighted sum of Gaussian functions whose parameters (rise time, decay rate, peak flux, etc.) have direct astrophysical meaning.
  • Deep‑sets aggregation for panel‑level inference – captures correlations across multiple objects (e.g., a set of observations belonging to the same supernova) without requiring a fixed sequence length.
  • Demonstrated 3–4 orders of magnitude speedup over classic MCMC inference while maintaining comparable parameter estimation accuracy on simulated and real supernova datasets.

Methodology

  1. Data preprocessing – Light curves from the Rubin Observatory’s simulated alerts are treated as panels: each panel contains all observations belonging to a single transient, possibly across several photometric bands, and is highly irregular (gaps, varying cadence).

  2. Encoder (Masked GRU‑ODE) – A gated recurrent unit (GRU) processes the observed points, but a mask tells the network which time steps are missing, preventing the model from learning spurious dynamics. The GRU is coupled with an ODE solver that treats the hidden state as a continuous‑time signal, allowing the encoder to respect the exact timestamps of each measurement.
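
As a rough illustration, the masked update can be written as follows (a pure-NumPy sketch; the Euler drift, weight shapes, and function names are illustrative assumptions, not the paper's exact parameterization):

```python
import numpy as np

def gru_cell(h, x, Wz, Wr, Wh):
    """Standard GRU update; each weight matrix acts on the concatenated
    [hidden, input] vector."""
    hx = np.concatenate([h, x])
    z = 1.0 / (1.0 + np.exp(-Wz @ hx))            # update gate
    r = 1.0 / (1.0 + np.exp(-Wr @ hx))            # reset gate
    h_tilde = np.tanh(Wh @ np.concatenate([r * h, x]))
    return (1.0 - z) * h + z * h_tilde

def masked_gru_ode_encode(times, obs, mask, drift, gru_weights, h0):
    """Evolve the hidden state continuously with `drift` between
    timestamps (one Euler step per gap, for brevity) and apply the GRU
    update only where mask == 1, i.e. where an observation exists."""
    h, t_prev = h0, times[0]
    for t, x, m in zip(times, obs, mask):
        h = h + (t - t_prev) * drift(h)           # continuous-time flow
        if m:                                      # skip missing points
            h = gru_cell(h, x, *gru_weights)
        t_prev = t
    return h
```

Because masked steps skip the GRU update, two panels that differ only at missing time steps produce identical encodings, which is the sense in which the mask prevents spurious dynamics.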

  3. Latent dynamics (Neural ODE) – The hidden representation is fed into a neural ODE that learns a differential equation governing its evolution. Integrating this ODE forward yields the latent state at any future epoch, which is what makes extrapolation beyond the observed time span possible.
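
A minimal fixed-step integrator conveys the forward-integration idea (the paper's actual solver and the learned dynamics are abstracted into a generic callable `f` here):

```python
import numpy as np

def rk4_integrate(f, z0, t0, t1, n_steps=100):
    """Integrate dz/dt = f(z) from t0 to t1 with fixed-step RK4.
    Stands in for the adaptive solvers used by neural-ODE libraries."""
    z = np.asarray(z0, dtype=float)
    h = (t1 - t0) / n_steps
    for _ in range(n_steps):
        k1 = f(z)
        k2 = f(z + 0.5 * h * k1)
        k3 = f(z + 0.5 * h * k2)
        k4 = f(z + h * k3)
        z = z + (h / 6.0) * (k1 + 2 * k2 + 2 * k3 + k4)
    return z
```

In the actual model `f` would be a trained neural network; here a linear decay checks that the integrator reproduces the known solution.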

  4. Panel aggregation (Deep Sets) – When multiple related light curves (e.g., multi‑band observations) are present, a permutation‑invariant deep‑sets module aggregates their latent trajectories into a single distribution over the latent space.
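
Permutation invariance is the essential property of this step; a sketch with assumed embedding functions `phi` and `rho` (names borrowed from the usual deep-sets notation, not from the paper):

```python
import numpy as np

def deep_sets_aggregate(latents, phi, rho):
    """Permutation-invariant pooling: embed each element with phi,
    sum the embeddings, then map the pooled vector through rho."""
    pooled = np.sum([phi(z) for z in latents], axis=0)
    return rho(pooled)
```

Summing before applying `rho` is what makes the result independent of both the ordering and the number of light curves in the panel, so no fixed sequence length is needed.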

  5. Decoder (Gaussian‑basis) – The latent distribution is decoded into a mixture of Gaussian basis functions. Each basis function’s amplitude, width, and center correspond to physically interpretable quantities such as peak brightness, rise time, and decay rate.
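
The decoder's output can be sketched as a plain weighted sum of Gaussian bumps (parameter names here are illustrative stand-ins for the interpretable quantities):

```python
import numpy as np

def gaussian_basis_flux(t, amplitudes, centers, widths):
    """Decoded light curve: weighted sum of Gaussian basis functions.
    Each (amplitude, center, width) triple stands in for an
    interpretable quantity such as peak flux, time of peak, and
    rise/decay scale."""
    t = np.atleast_1d(t).astype(float)[:, None]    # shape (T, 1)
    g = np.exp(-0.5 * ((t - centers) / widths) ** 2)
    return (g * amplitudes).sum(axis=1)            # shape (T,)
```

Reading the fitted parameters directly off the basis functions is what gives this decoder its astrophysical interpretability.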

  6. Training – The whole pipeline is trained end‑to‑end using a variational lower‑bound objective: reconstruction loss (how well the decoded light curve matches the observed points) plus a KL‑divergence term that regularizes the latent distribution.
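
Assuming a diagonal Gaussian posterior and a standard-normal prior, the two terms of the variational objective have familiar closed forms (a sketch of the structure, not the paper's exact loss):

```python
import numpy as np

def elbo_terms(y_obs, y_pred, sigma_obs, mu_q, logvar_q):
    """Negative ELBO = Gaussian reconstruction NLL + KL(q(z|x) || N(0, I)).
    Heteroscedastic measurement errors enter through per-point sigma_obs."""
    nll = 0.5 * np.sum(((y_obs - y_pred) / sigma_obs) ** 2
                       + np.log(2 * np.pi * sigma_obs ** 2))
    kl = 0.5 * np.sum(np.exp(logvar_q) + mu_q ** 2 - 1.0 - logvar_q)
    return nll + kl, kl
```

The KL term vanishes exactly when the posterior matches the prior, which is what regularizes the latent distribution during end-to-end training.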

Results & Findings

Metric                               Traditional MCMC (per object)   SELDON (per object)
Inference time                       ~2 h (CPU)                      ~5 ms (GPU)
Parameter RMSE (rise time)           0.12 days                       0.14 days
Parameter RMSE (peak flux)           0.08 mag                        0.09 mag
Coverage of 95% credible intervals   94%                             92%
  • Speed: SELDON processes >10 k light‑curve panels per second on a single GPU, comfortably handling the projected 10 M nightly alerts from LSST.
  • Accuracy: Parameter estimates (rise time, decay rate, peak flux) are statistically indistinguishable from those obtained with expensive MCMC, confirming that the continuous‑time latent dynamics capture the underlying physics.
  • Interpretability: The Gaussian‑basis decoder yields a compact set of parameters that can be directly fed into downstream decision‑making (e.g., prioritizing spectroscopic follow‑up).
  • Robustness to sparsity: Experiments where only 10 % of the observations are retained still produce reliable forecasts, demonstrating resilience to the severe gaps typical of early‑time transient detection.

Practical Implications

  • Real‑time alert triage – Survey pipelines can instantly flag the most scientifically valuable transients (e.g., rare super‑luminous supernovae) for rapid spectroscopic follow‑up, dramatically increasing the scientific return of LSST.
  • Scalable infrastructure – Because inference is GPU‑friendly and runs in milliseconds, observatories can embed SELDON into their alert brokers without needing massive CPU farms.
  • Cross‑domain applicability – Any industry dealing with irregular, multivariate time series—such as predictive maintenance (sensor logs), finance (tick‑level trades), or health monitoring (wearable data)—can adopt the same encoder‑propagator‑decoder recipe to obtain interpretable forecasts.
  • Model‑driven simulation – The latent ODE can be sampled to generate synthetic light curves that respect the learned physics, useful for training other downstream classifiers or for augmenting scarce labeled datasets.
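
Combining a prior sample, a latent flow, and a decoder gives a toy generator of this kind (every component here is a simplified stand-in for the learned modules, with a one-step Euler flow for brevity):

```python
import numpy as np

def sample_synthetic_light_curve(times, drift, decode, latent_dim, seed=0):
    """Draw z0 from the standard-normal prior, push it through a latent
    ODE (Euler steps between timestamps), and decode each latent state
    into a flux value."""
    rng = np.random.default_rng(seed)
    z = rng.standard_normal(latent_dim)
    fluxes, t_prev = [], times[0]
    for t in times:
        z = z + (t - t_prev) * drift(z)   # advance the latent state
        fluxes.append(decode(z))          # map latent -> observed flux
        t_prev = t
    return np.array(fluxes)
```

Sampling many such curves yields synthetic training data that respects whatever dynamics the latent ODE has learned.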

Limitations & Future Work

  • Training data dependence – SELDON’s performance hinges on a representative training set; rare or exotic transients not seen during training may be mis‑characterized.
  • Interpretability trade‑off – While the Gaussian‑basis decoder is more interpretable than a black‑box decoder, it still abstracts away detailed radiative‑transfer physics that some astrophysicists may wish to retain.
  • Scalability of the deep‑sets module – Aggregating extremely large panels (hundreds of bands or instruments) can become memory‑intensive; future work will explore hierarchical set representations.
  • Extension to multimodal data – Incorporating non‑photometric information (e.g., host‑galaxy spectra, contextual metadata) could further improve forecasting accuracy and is a planned direction.

SELDON demonstrates that continuous‑time deep generative models can bridge the gap between massive, irregular astronomical data streams and the need for fast, physically meaningful inference—opening the door for similar breakthroughs across any domain where time‑stamped, sparse data reign.

Authors

  • Jiezhong Wu
  • Jack O’Brien
  • Jennifer Li
  • M. S. Krafczyk
  • Ved G. Shah
  • Amanda R. Wasserman
  • Daniel W. Apley
  • Gautham Narayan
  • Noelle I. Samia

Paper Information

  • arXiv ID: 2603.04392v1
  • Categories: astro-ph.IM, cs.LG
  • Published: March 4, 2026