[Paper] MemDreamer: Decoupling Perception and Reasoning for Long Video Understanding via Hierarchical Graph Memory and Agentic Retrieval Mechanism

Published: (June 5, 2026 at 01:59 PM EDT)
2 min read
Source: arXiv

Source: arXiv - 2606.07512v1

Overview

Current Vision-Language Models struggle with hours-long videos because processing full-length visual sequences induces prohibitive token explosion and attention dilution. To overcome this, we introduce MemDreamer to decouple perception and reasoning, shifting long-video understanding into an agentic exploration process. As a plug-and-play framework, it incrementally streams videos to construct a Hierarchical Graph Memory, a top-down three-tier architecture for semantic abstraction, anchored by a foundational graph capturing spatiotemporal and causal relations. During inference, the reasoning model employs agentic tool-augmented retrieval, navigating hierarchies, searching nodes, and traversing logical edges via an Observation-Reason-Action loop. Experiments show MemDreamer achieves SOTA results across four mainstream benchmarks, narrowing the gap with human experts to only 3.7 points. It constrains the reasoning context window to merely 2% of full-context ingestion while delivering a 12.5 point absolute accuracy gain. Furthermore, statistical analysis uncovers a strong positive linear correlation between an VLM’s performance on logic reasoning and long-video understanding benchmarks, establishing agentic capability scaling as a new paradigm for multimodal comprehension.

Key Contributions

This paper presents research in the following areas:

  • cs.CV
  • cs.AI
  • cs.CL

Methodology

Please refer to the full paper for detailed methodology.

Practical Implications

This research contributes to the advancement of cs.CV.

Authors

  • Cong Chen
  • Guo Gan
  • Kaixiang Ji
  • ChaoYang Zhang
  • Zhen Yang
  • Guangming Yao
  • Hao Chen
  • Jingdong Chen
  • Yi Yuan
  • Chunhua Shen

Paper Information

  • arXiv ID: 2606.07512v1
  • Categories: cs.CV, cs.AI, cs.CL
  • Published: June 5, 2026
  • PDF: Download PDF
0 views
Back to Blog

Related posts

Read more »