[Paper] Order Matters in Retrosynthesis: Structure-aware Generation via Reaction-Center-Guided Discrete Flow Matching

Published: February 13, 2026
Source: arXiv - 2602.13136v1

Overview

A new paper tackles retrosynthesis—the problem of figuring out how to make a target molecule—from a fresh angle: the order in which atoms are presented to a neural network matters. By deliberately placing the atoms that form the reaction center at the front of the input sequence, the authors turn implicit chemical knowledge into a simple positional cue that modern graph transformers can exploit. The result is a template‑free system that reaches state‑of‑the‑art accuracy while needing far fewer inference steps than previous diffusion‑based models.

Key Contributions

  • Positional inductive bias: Introduces a “reaction‑center‑first” atom ordering that makes the most chemically relevant substructure easy for the model to spot.
  • RetroDiT backbone: A graph transformer equipped with rotary position embeddings that directly consumes the ordered atom sequence.
  • Discrete flow matching: Decouples training from sampling, allowing the model to generate precursor sets in 20‑50 steps (vs. ~500 steps for earlier diffusion approaches).
  • Strong empirical results: Sets new top‑1 accuracy records on USPTO‑50K (61.2 %) and USPTO‑Full (51.3 %) with predicted reaction centers; jumps to 71.1 % / 63.4 % when oracle centers are supplied.
  • Efficiency over scale: Shows that a 280 K‑parameter model with the ordering trick matches the performance of a 65 M‑parameter model lacking it, highlighting the power of structural priors over brute‑force scaling.
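The discrete flow matching idea above can be sketched as a short sampling loop. This is an illustrative toy, not the paper's implementation: the function names, the uniform-noise prior, and the linear jump schedule are all assumptions. The key property it demonstrates is that the number of sampling steps is a free inference-time choice, decoupled from training.

```python
import numpy as np

def sample_discrete_flow(predict_x1, n_tokens, n_classes, n_steps=20, rng=None):
    """Minimal discrete-flow-matching sampler (illustrative sketch).

    predict_x1(x_t, t) -> (n_tokens, n_classes) probabilities for the clean
    sample x_1 given the current noisy state x_t at time t in [0, 1).
    Starting from uniform noise, each of n_steps transitions jumps a token
    to a draw from the predicted clean distribution with probability
    dt / (1 - t), so every token has been resampled by the final step.
    """
    rng = rng or np.random.default_rng(0)
    x = rng.integers(0, n_classes, size=n_tokens)      # x_0: uniform noise
    dt = 1.0 / n_steps
    for step in range(n_steps):
        t = step * dt
        probs = predict_x1(x, t)                       # model's guess at x_1
        jump = rng.random(n_tokens) < dt / (1.0 - t)   # which tokens move now
        proposals = np.array([rng.choice(n_classes, p=p) for p in probs])
        x = np.where(jump, proposals, x)
    return x
```

Because the jump probability reaches 1 at the last step, the loop converges to a sample from the predicted clean distribution regardless of whether `n_steps` is 20 or 500, which is what makes the short schedules reported here possible.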

Methodology

  1. Two‑stage view of a reaction – First, identify the reaction center (atoms whose bonds change); second, reconstruct the full precursor molecules.
  2. Atom ordering as a bias – The authors reorder the graph’s node list so that reaction‑center atoms appear at the beginning of the sequence fed to the transformer. This turns “where the chemistry happens” into a simple positional pattern.
  3. RetroDiT architecture – A graph transformer that processes the ordered node list, using rotary position embeddings to preserve relative order information without sacrificing permutation invariance of the rest of the graph.
  4. Discrete flow matching – Instead of learning a continuous diffusion process, the model learns a discrete transformation that directly maps a latent “noise” graph to a valid precursor graph. Because training is decoupled from the sampling schedule, inference can step through a short, fixed number of discrete transitions (20‑50) to produce a candidate precursor set.
  5. Reaction‑center prediction – A lightweight classifier predicts the reaction center from the target molecule; its output guides the ordering for the main generator.
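The ordering step (item 2 above) amounts to a single permutation applied consistently to the node features and the adjacency matrix. A minimal NumPy sketch, with hypothetical names not taken from the paper:

```python
import numpy as np

def reaction_center_first_order(adj, features, center_atoms):
    """Reorder a molecular graph so reaction-center atoms come first.

    adj:          (N, N) adjacency/bond-order matrix
    features:     (N, F) per-atom feature matrix
    center_atoms: indices of atoms in the (predicted) reaction center
    Returns the permuted adjacency, permuted features, and the permutation.
    Illustrative helper; the paper's actual preprocessing may differ.
    """
    n = adj.shape[0]
    center = list(center_atoms)
    rest = [i for i in range(n) if i not in set(center)]
    perm = np.array(center + rest)
    # Permute rows and columns together so every bond stays attached
    # to the same pair of atoms under the new ordering.
    return adj[np.ix_(perm, perm)], features[perm], perm
```

The transformer then reads positions 0..k-1 as “where the chemistry happens,” turning a chemical annotation into a purely positional cue.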

Results & Findings

| Dataset | Setting | Top‑1 Accuracy |
|---|---|---|
| USPTO‑50K | Predicted centers | 61.2 % |
| USPTO‑Full | Predicted centers | 51.3 % |
| USPTO‑50K | Oracle (ground‑truth) centers | 71.1 % |
| USPTO‑Full | Oracle centers | 63.4 % |

  • Speed: Generation requires only 20‑50 discrete flow steps, a 10×‑25× speed‑up over prior diffusion‑based retrosynthesis models that needed ~500 steps.
  • Parameter efficiency: A 0.28 M‑parameter RetroDiT matches a 65 M‑parameter baseline that lacks the ordering bias, confirming that the structural prior is more valuable than sheer model size.
  • Data efficiency: The approach outperforms large foundation models trained on 10 B reactions, despite using only the standard USPTO datasets (≈1 M reactions).

Practical Implications

  • Faster AI‑assisted synthesis planning: Chemists can obtain candidate routes in seconds rather than minutes, enabling tighter integration into interactive design tools and automated lab workflows.
  • Reduced compute costs: The discrete flow matching scheme and small model size lower GPU memory and inference time, making deployment feasible on on‑premise servers or even high‑end workstations.
  • Better generalization with limited data: By encoding domain knowledge as a simple ordering, companies with proprietary reaction databases (often far smaller than public corpora) can train competitive retrosynthesis models without needing massive data collection.
  • Plug‑and‑play reaction‑center predictor: The modular design lets developers swap in a more sophisticated center‑prediction model (e.g., a graph‑based classifier fine‑tuned on a specific chemistry domain) to further boost accuracy.
  • Potential for downstream automation: The short, deterministic generation pipeline is well‑suited for coupling with robotic synthesis platforms that require rapid, reliable route suggestions.

Limitations & Future Work

  • Reliance on accurate reaction‑center prediction: If the center classifier errs, the ordering cue can mislead the generator, degrading performance.
  • Template‑free but still heuristic: While the model does not use explicit templates, the discrete flow steps are handcrafted; exploring fully learned flow dynamics could yield further gains.
  • Scalability to exotic chemistries: The benchmarks focus on patent reactions; extending to organometallic or biocatalytic transformations may require additional domain‑specific priors.
  • Integration with multi‑step planning: The paper evaluates single‑step retrosynthesis; future work could embed the method into a recursive planner that assembles multi‑step synthetic routes.

Authors

  • Chenguang Wang
  • Zihan Zhou
  • Lei Bai
  • Tianshu Yu

Paper Information

  • arXiv ID: 2602.13136v1
  • Categories: cs.LG
  • Published: February 13, 2026