[Paper] MAMA-Memeia! Multi-Aspect Multi-Agent Collaboration for Depressive Symptoms Identification in Memes

Published: December 31, 2025 at 01:06 PM EST
3 min read
Source: arXiv - 2512.25015v1

Overview

Memes have become a go‑to way for people to share feelings—sometimes even deep, depressive emotions—on platforms like Reddit, Instagram, and TikTok. The paper introduces MAMA‑Memeia, a multi‑agent system that mimics a clinical psychology technique (Cognitive Analytic Therapy) to spot depressive symptoms hidden in memes. By coupling large‑language‑model‑generated explanations with human annotations (the RESTOREx dataset), the authors push the accuracy of automated meme‑based mental‑health detection well beyond existing models.

Key Contributions

  • RESTOREx dataset – a new resource containing memes paired with both LLM‑generated and expert‑validated explanations of depressive cues.
  • MAMA‑Memeia framework – a collaborative multi‑agent architecture that breaks down meme analysis into several “aspects” (visual, textual, contextual, and psychological) and lets specialized agents discuss findings, mirroring the step‑wise reasoning of Cognitive Analytic Therapy.
  • State‑of‑the‑art performance – achieves a 7.55 % absolute gain in macro‑F1 over the previous best model, outperforming 30+ baselines across multiple benchmark splits.
  • Explainability built‑in – each agent produces a human‑readable rationale, making the system’s decisions transparent for clinicians and moderators.

Methodology

  1. Data Collection & Annotation

    • Gathered ~12k memes from public social‑media feeds.
    • Used GPT‑4 to draft initial symptom explanations, then had clinical psychologists refine them, creating the dual‑layer RESTOREx annotations.
  2. Multi‑Aspect Decomposition

    • Visual Agent: extracts visual features (color palette, facial expressions, objects) using a Vision Transformer.
    • Textual Agent: processes overlaid text with a fine‑tuned BERT model.
    • Contextual Agent: pulls surrounding post metadata (subreddit, hashtags, user comments).
    • Psychological Agent: maps extracted cues to DSM‑5 depressive symptom criteria via a rule‑based knowledge base.
  3. Collaborative Reasoning (MAMA‑Memeia)

    • Agents exchange their intermediate predictions in a round‑based dialogue, updating beliefs based on peer feedback.
    • A central Mediator aggregates the final symptom scores and generates a consolidated explanation.
  4. Training & Evaluation

    • End‑to‑end training with a multi‑task loss that balances classification accuracy and explanation fidelity.
    • Evaluated on macro‑F1, precision/recall per symptom, and an explanation‑quality metric (BLEU‑4 against human notes).
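
The round‑based dialogue and mediator aggregation described in steps 2–3 can be sketched as follows. This is a minimal illustration of the collaboration pattern, not the authors' implementation: the class names, symptom labels, trust weight, and update rule are all illustrative assumptions.

```python
from statistics import mean

SYMPTOMS = ["anhedonia", "low_mood", "fatigue"]  # illustrative subset of symptom labels

class AspectAgent:
    """One specialised agent (visual, textual, contextual, or psychological)."""
    def __init__(self, name, initial_scores, trust=0.5):
        self.name = name
        self.scores = dict(initial_scores)  # symptom -> belief in [0, 1]
        self.trust = trust                  # weight given to peer feedback

    def update(self, peer_scores):
        """Nudge own beliefs toward the average peer belief per symptom."""
        for s in SYMPTOMS:
            peer_avg = mean(p[s] for p in peer_scores)
            self.scores[s] += self.trust * (peer_avg - self.scores[s])

def mediate(agents, rounds=3, threshold=0.5):
    """Run the round-based dialogue, then aggregate final symptom scores."""
    for _ in range(rounds):
        # Snapshot so all agents in a round see the same peer beliefs.
        snapshot = [dict(a.scores) for a in agents]
        for i, agent in enumerate(agents):
            agent.update(snapshot[:i] + snapshot[i + 1:])
    final = {s: mean(a.scores[s] for a in agents) for s in SYMPTOMS}
    detected = [s for s, v in final.items() if v >= threshold]
    return final, detected

agents = [
    AspectAgent("visual",        {"anhedonia": 0.7, "low_mood": 0.6, "fatigue": 0.2}),
    AspectAgent("textual",       {"anhedonia": 0.8, "low_mood": 0.4, "fatigue": 0.3}),
    AspectAgent("contextual",    {"anhedonia": 0.5, "low_mood": 0.7, "fatigue": 0.1}),
    AspectAgent("psychological", {"anhedonia": 0.9, "low_mood": 0.8, "fatigue": 0.4}),
]
final, detected = mediate(agents)
```

Because each agent moves toward the peer average, disagreeing agents converge to a consensus over the rounds while the overall mean belief is preserved, which is the intuition behind the collaborative step adding most of the gain.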

Results & Findings

| Metric | MAMA‑Memeia | Next Best | Improvement |
| --- | --- | --- | --- |
| Macro‑F1 | 78.3 % | 70.8 % | +7.55 % |
| Symptom‑wise F1 (average) | 81.2 % | 73.4 % | +7.8 % |
| Explanation BLEU‑4 | 0.42 | 0.31 | +0.11 |

  • The visual and textual agents alone plateau at ~70 % macro‑F1; the collaborative step adds the bulk of the gain.
  • Human evaluators rated the system’s explanations as “clear and clinically relevant” 84 % of the time, a notable jump from 62 % for baseline models.
  • Ablation studies confirm that removing any single aspect drops performance by 2–4 %, underscoring the importance of the multi‑aspect design.
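
The headline metric, macro‑F1, is the unweighted mean of per‑symptom F1 scores, so rare symptoms count as much as common ones. A minimal computation on toy counts (not the paper's data):

```python
def f1(tp, fp, fn):
    """F1 from raw counts; 0 when the class never appears."""
    denom = 2 * tp + fp + fn
    return 2 * tp / denom if denom else 0.0

def macro_f1(per_symptom_counts):
    """Unweighted mean of per-symptom F1 scores."""
    scores = [f1(*c) for c in per_symptom_counts.values()]
    return sum(scores) / len(scores)

# Toy counts: (true positives, false positives, false negatives) per symptom.
counts = {
    "anhedonia": (40, 10, 5),
    "low_mood":  (30, 5, 15),
    "fatigue":   (10, 2, 18),
}
score = macro_f1(counts)  # ≈ 0.697
```

Averaging F1 per symptom rather than pooling all predictions is what makes a 7.55‑point macro‑F1 gain meaningful: it cannot be achieved by improving only the most frequent symptom.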

Practical Implications

  • Content Moderation: Platforms can flag potentially harmful meme clusters in real time, enabling quicker intervention before crises escalate.
  • Clinical Screening Tools: Therapists could use the system as a low‑cost triage aid, spotting early depressive signals in a patient’s social‑media footprint.
  • Mental‑Health Chatbots: Embedding MAMA‑Memeia’s reasoning engine can make conversational agents more empathetic, allowing them to respond appropriately to meme‑based cues.
  • Research & Public Health: The RESTOREx dataset offers a benchmark for future multimodal mental‑health NLP work, encouraging more transparent, explainable models.

Limitations & Future Work

  • Cultural Bias: The meme corpus is dominated by English‑speaking Western platforms; symptom detection may falter on culturally specific humor or non‑English memes.
  • Privacy Concerns: Deploying such detection at scale raises ethical questions about user consent and data handling.
  • Dynamic Memes: Rapid meme evolution (new formats, slang) could outpace the static knowledge base; the authors suggest continual fine‑tuning with fresh data streams.
  • Future Directions: Incorporate multimodal transformers that jointly process image‑text pairs, explore cross‑lingual extensions, and develop privacy‑preserving on‑device inference.

Authors

  • Siddhant Agarwal
  • Adya Dhuler
  • Polly Ruhnke
  • Melvin Speisman
  • Md Shad Akhtar
  • Shweta Yadav

Paper Information

  • arXiv ID: 2512.25015v1
  • Categories: cs.CL
  • Published: December 31, 2025