[Paper] PaperDebugger: A Plugin-Based Multi-Agent System for In-Editor Academic Writing, Review, and Editing

Published: December 2, 2025 at 05:00 AM EST
4 min read
Source: arXiv

Overview

PaperDebugger is a Chrome extension that turns LaTeX editors (e.g., Overleaf) into a collaborative workspace for large‑language‑model (LLM) agents. By embedding the assistant directly in the editor, authors can invoke context‑aware reasoning, automated literature look‑ups, and structured reviews without leaving the document. The authors demonstrate that this tight integration is feasible and already useful for real‑world academic writing pipelines.

Key Contributions

  • In‑editor multi‑agent architecture – LLM agents run side‑by‑side inside the editor, accessing the current LaTeX source, revision history, and bibliography.
  • Plugin‑based extensibility – A lightweight Chrome extension loads “plugins” (search, citation lookup, scoring, etc.) that can be swapped or combined at runtime.
  • Model Context Protocol (MCP) – A custom protocol that synchronizes fine‑grained document patches, version control, and secure state between the editor UI and a Kubernetes‑backed orchestration layer.
  • Parallel agent scheduling – The system can launch several agents concurrently (e.g., one to suggest wording, another to verify references) and merge their diffs back into the document.
  • Real‑world demo & analytics – A public demo shows end‑to‑end editing, review, and revision cycles, while early usage metrics confirm active engagement from researchers.
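The paper does not publish the scheduler's internals, but the launch-in-parallel, merge-the-diffs idea behind the fourth contribution can be sketched with Python's `asyncio`. All names here (the agents, the `Patch` shape, the merge rule) are illustrative assumptions, not the system's actual API:

```python
import asyncio
from dataclasses import dataclass

@dataclass
class Patch:
    agent: str   # which agent produced the edit
    line: int    # target line in the LaTeX source
    old: str
    new: str

async def wording_agent(source: str) -> list[Patch]:
    # Hypothetical agent: suggests a wording change (placeholder logic).
    await asyncio.sleep(0)
    return [Patch("wording", 2, "utilize", "use")]

async def reference_agent(source: str) -> list[Patch]:
    # Hypothetical agent: verifies/fills a citation (placeholder logic).
    await asyncio.sleep(0)
    return [Patch("references", 5, r"\cite{??}", r"\cite{hou2025}")]

async def run_agents(source: str) -> list[Patch]:
    # Launch both agents concurrently and gather their diffs.
    results = await asyncio.gather(wording_agent(source),
                                   reference_agent(source))
    # Naive merge: order patches by line; patches touching the same
    # line would need conflict arbitration in a real orchestrator.
    return sorted((p for ps in results for p in ps), key=lambda p: p.line)
```

Because the two agents edit disjoint lines, the merged patch list applies cleanly; the orchestrator described in the paper is what handles the harder case of overlapping edits.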

Methodology

  1. Chrome Extension Front‑end – The extension injects a minimal UI into Overleaf (or any web‑based LaTeX editor). It captures the current document snapshot and forwards it to the back‑end via WebSocket.
  2. Kubernetes Orchestration – Each LLM request spawns a containerized agent. The orchestrator handles load‑balancing, scaling, and isolation, ensuring that multiple agents can run simultaneously without interfering.
  3. Model Context Protocol (MCP) – MCP defines a bidirectional message format:
    • State sync: editor and agent exchange the full source tree plus a diff‑based patch log.
    • Command channel: UI triggers actions like “search literature”, “check citation”, or “rewrite paragraph”.
    • Security: Sensitive tokens and model parameters stay on the server side; only sanitized diffs travel to the client.
  4. Plugin System – Plugins are small Python/Node modules that implement a standard MCP handler. For example, the “Reference Lookup” plugin queries arXiv/Semantic Scholar, while the “Document Scorer” plugin runs a fine‑tuned LLM to assign a quality score.
  5. Evaluation – The authors collected interaction logs from a beta group (≈30 users, 2 weeks) and measured latency, edit acceptance rate, and overall satisfaction via surveys.
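The MCP message format itself is not reproduced in the summary above, but the three channels (state sync, commands, sanitized transport) suggest a simple JSON envelope. The field names below are assumptions for illustration, not the protocol's actual schema:

```python
import json

def make_state_sync(doc_id: str, base_rev: int, patches: list) -> dict:
    """Hypothetical MCP 'state sync' envelope: a diff-based patch log
    applied against a known base revision."""
    return {
        "type": "state_sync",
        "doc": doc_id,
        "base_rev": base_rev,
        "patches": patches,   # e.g. unified-diff hunks
    }

def make_command(action: str, target: dict) -> dict:
    """Hypothetical command-channel message, e.g. 'rewrite paragraph'."""
    return {"type": "command", "action": action, "target": target}

def sanitize(msg: dict, secret_keys=("api_key", "model_params")) -> dict:
    """Server-side sanitization: strip sensitive fields before a message
    leaves the orchestrator, mirroring the security rule above."""
    return {k: v for k, v in msg.items() if k not in secret_keys}

# A command as it might travel over the WebSocket channel:
wire = json.dumps(make_command("rewrite_paragraph",
                               {"file": "main.tex", "line": 42}))
```

The key design point carried over from the paper is that tokens and model parameters never enter these client-bound messages; only sanitized diffs and commands cross the WebSocket.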

Results & Findings

| Metric | Value | Interpretation |
| --- | --- | --- |
| Avg. round‑trip latency (editor ↔ agent) | 1.2 s | Fast enough for interactive use |
| Patch acceptance rate (edits kept by authors) | 68 % | Majority of suggestions are useful |
| Parallel agent speed‑up | 2.3× vs. sequential execution | Parallelism yields noticeable productivity gains |
| User‑reported satisfaction (1–5) | 4.2 | Positive reception among early adopters |
| Citation‑retrieval accuracy | 92 % (top‑3 results) | Reliable literature‑search integration |

These numbers indicate that an in‑editor LLM assistant can operate with low latency, produce high‑quality suggestions, and scale to multiple concurrent agents without degrading the user experience.

Practical Implications

  • Reduced context switching – Researchers no longer need to copy‑paste text into external chatbots; the assistant works on the exact LaTeX source, preserving macros, custom commands, and bibliography files.
  • Automated literature curation – The “search” plugin can pull up relevant papers, auto‑generate BibTeX entries, and even suggest where to cite them, cutting hours of manual bibliography work.
  • Continuous quality scoring – Real‑time document scoring helps teams monitor draft maturity, making it easier to decide when a manuscript is ready for submission.
  • Team‑wide collaborative editing – Because the system tracks version diffs, multiple co‑authors can invoke agents concurrently, and the orchestrator merges edits safely—ideal for large research groups.
  • Extensible ecosystem – Organizations can develop proprietary plugins (e.g., compliance checks for specific journals) and drop them into the same UI, fostering a marketplace of writing‑assistant tools.

For developers, the open‑source repo provides a ready‑made Chrome extension, a Docker‑compose‑compatible back‑end, and clear MCP specifications, making it straightforward to prototype new agents or integrate existing LLM services.
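To make the plugin idea concrete, here is a minimal sketch of what an MCP-style handler for the "Reference Lookup" plugin might look like. The repo defines its own handler spec; this class, its method names, and the canned result are all assumptions meant only to illustrate the shape:

```python
class ReferenceLookupPlugin:
    """Hypothetical plugin implementing an MCP-style handler interface.
    A real plugin would query arXiv/Semantic Scholar; this one returns
    a canned result so the sketch stays self-contained."""

    name = "reference_lookup"

    def handle(self, command: dict) -> dict:
        # Ignore commands this plugin is not responsible for.
        if command.get("action") != "search_literature":
            return {"status": "ignored"}
        query = command["target"]["query"]
        # Placeholder result standing in for a live literature search.
        results = [{"title": f"Top match for {query!r}",
                    "bibtex": "@misc{placeholder}"}]
        return {"status": "ok", "results": results}
```

Because every plugin answers the same `handle(command)` contract, the orchestrator can route commands to swappable plugins at runtime without knowing their internals, which is what makes the marketplace-of-plugins vision plausible.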

Limitations & Future Work

  • Editor dependency – Currently limited to web‑based LaTeX editors; native desktop tools (e.g., TeXstudio) are not supported.
  • LLM cost & latency spikes – Heavy models can increase response times; the authors plan to add adaptive model selection based on task complexity.
  • Security & privacy – While MCP sanitizes data, transmitting full source files to a remote server may raise concerns for confidential research; future versions aim for on‑premise orchestration.
  • User study scale – The evaluation involved a modest user pool; larger, multi‑disciplinary studies are needed to generalize findings.

The authors envision expanding the plugin ecosystem, adding support for other scientific writing formats (Word, Markdown), and exploring tighter integration with version‑control platforms like GitHub.

Authors

  • Junyi Hou
  • Andre Lin Huikai
  • Nuo Chen
  • Yiwei Gong
  • Bingsheng He

Paper Information

  • arXiv ID: 2512.02589v1
  • Categories: cs.AI, cs.SE
  • Published: December 2, 2025
  • PDF: Download PDF