[Paper] Matrix: Peer-to-Peer Multi-Agent Synthetic Data Generation Framework

Published: November 26, 2025 at 01:59 PM EST
Source: arXiv - 2511.21686v1

Overview

Synthetic data has become increasingly important for training large language models, especially when real data is scarce, expensive, or privacy‑sensitive. Many such generation tasks require coordinated multi‑agent workflows, where specialized agents collaborate to produce data that is higher quality, more diverse, and structurally richer. However, existing frameworks for multi‑agent synthesis often depend on a centralized orchestrator, creating scalability bottlenecks, or are hardcoded for specific domains, limiting flexibility.

We present Matrix, a decentralized framework that represents both control and data flow as serialized messages passed through distributed queues. This peer‑to‑peer design eliminates the central orchestrator. Each task progresses independently through lightweight agents, while compute‑intensive operations, such as LLM inference or containerized environments, are handled by distributed services. Built on Ray, Matrix scales to tens of thousands of concurrent agentic workflows and provides a modular, configurable design that enables easy adaptation to a wide range of data generation workflows.
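The message-passing idea can be illustrated with a small sketch. This is not Matrix's API and does not use Ray: it is a single-process approximation with Python's asyncio queues, and the role names ("writer", "critic"), the message fields, and the fixed route are all assumptions made for illustration. The point it tries to show is that each serialized message carries both the task data and the control state (where to go next), so agents hand work directly to each other without a central orchestrator tracking every task.

```python
import asyncio
import json

# Hypothetical roles and route, chosen for illustration only.
ROUTE = ["writer", "critic", "done"]


def next_role(role):
    """Return the role that follows `role` in the fixed route."""
    return ROUTE[ROUTE.index(role) + 1]


async def agent(role, queues, work):
    """Consume serialized messages from this role's queue and forward them."""
    while True:
        raw = await queues[role].get()
        msg = json.loads(raw)              # message holds data and control state
        msg["history"].append(work(msg))   # perform this agent's step
        msg["next"] = next_role(role)      # routing decision travels with the message
        await queues[msg["next"]].put(json.dumps(msg))


async def run(task_ids):
    """Push tasks into the first queue and collect them from the last one."""
    queues = {role: asyncio.Queue() for role in ROUTE}
    workers = [
        asyncio.create_task(
            agent("writer", queues, lambda m: f"draft-{m['task_id']}")),
        asyncio.create_task(
            agent("critic", queues, lambda m: f"review-{m['task_id']}")),
    ]
    for tid in task_ids:
        msg = {"task_id": tid, "next": "writer", "history": []}
        await queues["writer"].put(json.dumps(msg))

    # Each task arrives at "done" on its own; nothing coordinates them centrally.
    results = [json.loads(await queues["done"].get()) for _ in task_ids]

    for w in workers:
        w.cancel()
    await asyncio.gather(*workers, return_exceptions=True)
    return sorted(results, key=lambda m: m["task_id"])


results = asyncio.run(run([0, 1, 2]))
```

In a distributed setting the in-memory queues would be replaced by distributed ones and the `work` callables would call out to shared services such as LLM inference, which is the split the abstract describes; how Matrix actually implements this is not specified here.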

We evaluate Matrix across diverse synthesis scenarios, such as multi‑agent collaborative dialogue, web‑based reasoning data extraction, and tool‑use trajectory generation in customer service environments. In all cases, Matrix achieves $2$–$15\times$ higher data generation throughput under identical hardware resources, without compromising output quality.

Authors

  • Dong Wang
  • Yang Li
  • Ansong Ni
  • Ching‑Feng Yeh
  • Youssef Emad
  • Xinjie Lei
  • Liam Robbins
  • Karthik Padthe
  • Hu Xu
  • Xian Li
  • Asli Celikyilmaz
  • Ramya Raghavendra
  • Lifei Huang
  • Carole‑Jean Wu
  • Shang‑Wen Li

Categories

  • cs.CL
  • cs.AI
  • cs.LG

Paper Information

  • arXiv ID: 2511.21686v1
  • Published: November 27, 2025