[Paper] ReaSeq: Unleashing World Knowledge via Reasoning for Sequential Modeling

Published: December 24, 2025 at 11:06 AM EST
4 min read
Source: arXiv - 2512.21257v1

Overview

ReaSeq is a new framework that injects the world knowledge stored in Large Language Models (LLMs) into industrial recommender systems. By combining explicit chain‑of‑thought reasoning with latent diffusion‑based inference, it tackles two long‑standing pain points: sparse, ID‑only item embeddings and the inability to surface interests that lie outside a platform’s historical logs.

Key Contributions

  • Hybrid reasoning pipeline – mixes explicit multi‑agent Chain‑of‑Thought (CoT) reasoning to generate structured product semantics with implicit diffusion‑based LLM reasoning that imagines plausible user actions beyond recorded clicks.
  • Semantic enrichment of item IDs – transforms raw item identifiers into dense, knowledge‑grounded vectors that capture attributes, usage contexts, and cross‑domain relations.
  • Beyond‑log behavior generation – a diffusion LLM predicts “what a user might do next” even when no prior interaction exists, effectively widening the recommendation horizon.
  • Large‑scale production validation – deployed on Taobao’s real‑time ranking pipeline serving hundreds of millions of users, delivering a lift of more than 6 % in click‑through rate (CTR) and impressions per view (IPV), a 2.9 % increase in orders, and 2.5 % growth in gross merchandise value (GMV).
  • Multi‑agent collaboration design – introduces a lightweight coordination protocol that lets several specialized agents (knowledge extractor, semantic mapper, behavior generator) share intermediate reasoning steps without heavy model retraining.

Methodology

  1. Data Ingestion – Existing interaction logs (user‑item clicks, purchases) are fed to a knowledge extraction agent.
  2. Explicit CoT Reasoning
    • A set of prompts guides the LLM to break down each item into a hierarchy of attributes (category, material, style, usage scenario, etc.).
    • The multi‑agent system iteratively refines these attributes, producing a structured knowledge graph per item.
    • The graph is then embedded (e.g., via Graph Neural Networks) to create a semantic item vector that augments the traditional ID embedding.
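The explicit reasoning step can be sketched in a few lines. This is a toy illustration, not the paper's implementation: `llm_complete` is a hypothetical stub standing in for a real LLM endpoint, and `embed_text` hashes text into a vector as a stand-in for a learned (e.g., GNN-based) encoder.

```python
import hashlib
import json

def llm_complete(prompt: str) -> str:
    # Hypothetical LLM call, stubbed so the sketch is self-contained.
    # A real call would return the model's CoT-derived JSON answer.
    return json.dumps({
        "category": "footwear",
        "material": "leather",
        "style": "casual",
        "usage_scenario": "daily commute",
    })

def extract_attributes(item_title: str) -> dict:
    """Explicit CoT step: prompt the LLM for a structured attribute hierarchy."""
    prompt = (
        "Think step by step and describe this product as JSON with keys "
        f"category, material, style, usage_scenario:\n{item_title}"
    )
    return json.loads(llm_complete(prompt))

def embed_text(text: str, dim: int = 8) -> list:
    """Toy stand-in for a learned encoder: hash text into a dense vector."""
    digest = hashlib.sha256(text.encode()).digest()
    return [b / 255.0 for b in digest[:dim]]

def semantic_item_vector(item_title: str) -> list:
    """Average attribute embeddings into one knowledge-grounded item vector."""
    attrs = extract_attributes(item_title)
    vecs = [embed_text(f"{k}={v}") for k, v in attrs.items()]
    dim = len(vecs[0])
    return [sum(v[i] for v in vecs) / len(vecs) for i in range(dim)]

vec = semantic_item_vector("Men's brown leather loafers")
print(len(vec))  # an 8-dim semantic vector to concatenate with the ID embedding
```

In the deployed system this semantic vector would augment, not replace, the traditional ID embedding.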
  3. Implicit Diffusion Reasoning
    • A diffusion‑based LLM (e.g., Diffusion‑GPT) is conditioned on the user’s short‑term session and the enriched item vectors.
    • It samples plausible future interactions that are not present in the log, effectively hallucinating “beyond‑log” interests while staying grounded by the semantic knowledge.
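As a loose, toy analogy to that implicit reasoning step (not the paper's actual diffusion model), the sketch below starts from a uniform "noise" distribution over a tiny catalog and iteratively sharpens it toward items semantically close to the user's session, then picks an unseen candidate. The catalog, vectors, and step schedule are all invented for illustration.

```python
import math

# Toy catalog: item id -> 2-D semantic vector (stand-in for enriched embeddings).
CATALOG = {
    "tent":        (0.9, 0.8),
    "hiking_boot": (0.8, 0.9),
    "lipstick":    (0.1, 0.2),
    "keyboard":    (0.3, 0.1),
}

def session_centroid(session_items):
    vecs = [CATALOG[i] for i in session_items]
    n = len(vecs)
    return (sum(v[0] for v in vecs) / n, sum(v[1] for v in vecs) / n)

def denoise_step(probs, centroid, strength):
    """One refinement step: shift probability mass toward semantically close items."""
    scores = {}
    for item, p in probs.items():
        dist = math.dist(CATALOG[item], centroid)
        scores[item] = p * math.exp(-strength * dist)
    total = sum(scores.values())
    return {i: s / total for i, s in scores.items()}

def sample_beyond_log(session_items, steps=5):
    """Start from uniform 'noise' over the catalog, iteratively sharpen, then pick."""
    probs = {i: 1.0 / len(CATALOG) for i in CATALOG}
    centroid = session_centroid(session_items)
    for t in range(steps):
        probs = denoise_step(probs, centroid, strength=1.0 + t)
    # Exclude items already in the log so the candidate is genuinely "beyond-log".
    candidates = {i: p for i, p in probs.items() if i not in session_items}
    return max(candidates, key=candidates.get)

print(sample_beyond_log(["tent"]))  # prints "hiking_boot": a plausible unseen interest
```

Grounding the refinement in semantic vectors is what keeps the generated interests plausible rather than arbitrary.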
  4. Fusion & Ranking
    • The original collaborative‑filtering scores, the semantic vectors, and the diffusion‑generated candidate items are merged in a lightweight ranking model (often a feed‑forward network).
    • Real‑time inference runs within the latency budget of Taobao’s ranking service.
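The fusion step above can be sketched as a hand-weighted linear scorer standing in for the lightweight feed-forward ranker; the weights and candidate values here are made up for illustration.

```python
def fused_score(cf_score, semantic_sim, from_diffusion,
                w_cf=0.6, w_sem=0.3, w_diff=0.1):
    """Combine the collaborative-filtering score, semantic similarity, and a
    bonus for diffusion-generated candidates into one ranking score."""
    return w_cf * cf_score + w_sem * semantic_sim + w_diff * (1.0 if from_diffusion else 0.0)

candidates = [
    ("item_a", fused_score(cf_score=0.9, semantic_sim=0.2, from_diffusion=False)),
    ("item_b", fused_score(cf_score=0.0, semantic_sim=0.9, from_diffusion=True)),  # beyond-log
]
ranked = sorted(candidates, key=lambda t: t[1], reverse=True)
print(ranked[0][0])  # prints "item_a"
```

Note that a beyond-log candidate with no collaborative-filtering signal can still enter the ranking through its semantic and diffusion terms; in production these weights would be learned, not fixed.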

The whole pipeline is modular: any LLM can be swapped in, and the reasoning steps are logged for interpretability and debugging.

Results & Findings

| Metric | Log-only baseline | ReaSeq (deployed) | Relative lift |
|---|---|---|---|
| IPV (Impression per View) | 1.00 | 1.06 | +6.0 % |
| CTR | 0.12 | 0.127 | +6.0 % |
| Orders | 1,200 k | 1,235 k | +2.9 % |
| GMV | ¥1.00 B | ¥1.025 B | +2.5 % |

  • Sparse items (≤5 historical interactions) saw the biggest CTR boost (~9 %), confirming that semantic enrichment mitigates ID‑poverty.
  • Cold‑start users (new accounts) benefited from the diffusion‑generated candidates, with a 12 % increase in first‑day engagement.
  • Ablation studies showed that removing either the explicit CoT or the diffusion component reduced overall lift by ~3 % each, indicating that both reasoning modes are complementary.
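For readers checking the table above, relative lift is just the percentage change over the baseline:

```python
def relative_lift(baseline, treatment):
    """Relative lift in percent: (treatment - baseline) / baseline * 100."""
    return round((treatment - baseline) / baseline * 100, 1)

print(relative_lift(1.00, 1.06))            # IPV: 6.0 (%)
print(relative_lift(1_200_000, 1_235_000))  # Orders: 2.9 (%)
```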

Practical Implications

  • Improved cold‑start handling – Developers can plug ReaSeq’s semantic encoder into existing recommendation stacks to give new items a “knowledge boost” without waiting for interaction data.
  • Cross‑domain recommendation – Because the item knowledge graph captures universal attributes (e.g., “outdoor sport”), the same embeddings can be reused across different product categories or even different platforms.
  • Reduced reliance on massive logging – Companies with strict privacy constraints can still benefit from LLM‑derived world knowledge, lowering the volume of user‑level data needed for high‑quality rankings.
  • Interpretability for product teams – The explicit CoT steps produce human‑readable attribute lists, making it easier to audit why a recommendation surfaced (useful for compliance and trust).
  • Scalable architecture – The multi‑agent design runs inference in parallel and fits within typical latency SLAs (≈30 ms on commodity GPUs), meaning it can be rolled out to any high‑traffic e‑commerce site.
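The parallel multi-agent inference mentioned above can be sketched with a thread pool; the three agent functions here are stubs invented for illustration, each standing in for its own model call.

```python
from concurrent.futures import ThreadPoolExecutor

# Stub agents standing in for the knowledge extractor, semantic mapper, and
# behavior generator described earlier; each would wrap a model call in production.
def knowledge_extractor(item):  return {"item": item, "attrs": ["outdoor", "sport"]}
def semantic_mapper(item):      return {"item": item, "vector": [0.4, 0.7]}
def behavior_generator(item):   return {"item": item, "next": ["hiking_boot"]}

def run_agents(item):
    """Fan the three agents out in parallel and merge their outputs, mirroring
    the multi-agent design's parallel inference within a latency budget."""
    agents = [knowledge_extractor, semantic_mapper, behavior_generator]
    with ThreadPoolExecutor(max_workers=3) as pool:
        results = list(pool.map(lambda fn: fn(item), agents))
    # Merge everything except the shared "item" key into one record.
    return {k: v for r in results for k, v in r.items() if k != "item"}

out = run_agents("tent")
print(sorted(out))  # prints ['attrs', 'next', 'vector']
```

Because the agents are independent at inference time, the end-to-end latency is bounded by the slowest agent rather than the sum of all three.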

Limitations & Future Work

  • LLM hallucination risk – While diffusion reasoning is constrained by semantic vectors, occasional generation of implausible items was observed; tighter grounding mechanisms are needed.
  • Domain‑specific jargon – The current prompts are tuned for consumer goods; adapting to highly technical domains (e.g., B2B software) may require custom knowledge extraction pipelines.
  • Compute cost – Adding two LLM inference stages increases GPU usage; future work will explore distillation or quantization to keep operational expenses low.
  • User privacy – Although ReaSeq reduces raw log dependence, it still consumes session data; integrating differential privacy guarantees is an open research direction.

Overall, ReaSeq demonstrates that marrying world knowledge with reasoning can break the “log‑only” ceiling that many recommender systems face today, opening a path toward more intelligent, context‑aware, and universally applicable recommendation engines.

Authors

  • Chuan Wang
  • Gaoming Yang
  • Han Wu
  • Jiakai Tang
  • Jiahao Yu
  • Jian Wu
  • Jianwu Hu
  • Junjun Zheng
  • Shuwen Xiao
  • Yeqiu Yang
  • Yuning Jiang
  • Ahjol Nurlanbek
  • Binbin Cao
  • Bo Zheng
  • Fangmei Zhu
  • Gaoming Zhou
  • Huimin Yi
  • Huiping Chu
  • Jin Huang
  • Jinzhe Shan
  • Kenan Cui
  • Longbin Li
  • Silu Zhou
  • Wen Chen
  • Xia Ming
  • Xiang Gao
  • Xin Yao
  • Xingyu Wen
  • Yan Zhang
  • Yiwen Hu
  • Yulin Wang
  • Ziheng Bao
  • Zongyuan Wu

Paper Information

  • arXiv ID: 2512.21257v1
  • Categories: cs.IR, cs.CL
  • Published: December 24, 2025