[Paper] TAGFN: A Text-Attributed Graph Dataset for Fake News Detection in the Age of LLMs
Source: arXiv - 2511.21624v1
Overview
The paper introduces TAGFN, a new, large‑scale text‑attributed graph dataset built specifically for fake‑news detection. By coupling rich textual content with graph structure (e.g., social‑media interactions, article citations), TAGFN gives researchers a realistic benchmark to test both classic graph‑based outlier detectors and the newest Large Language Model (LLM)‑enhanced approaches.
Key Contributions
- A first‑of‑its‑kind dataset for graph‑outlier detection in the fake‑news domain, containing millions of nodes and edges along with high‑quality annotations.
- Unified evaluation framework that supports traditional graph algorithms, graph neural networks (GNNs), and LLM‑augmented models under the same experimental protocol.
- Fine‑tuning pipeline for adapting LLMs (e.g., GPT‑4, LLaMA) to the fake‑news detection task using the graph’s textual attributes.
- Open‑source release of the dataset (via Hugging Face) and accompanying code, encouraging reproducibility and community contributions (a loading sketch follows this list).
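To get started, here is a minimal loading sketch using the Hugging Face `datasets` library. The repository id below is a placeholder, since this summary does not state the official one; substitute the id from the actual release.

```python
from datasets import load_dataset

# Hypothetical repository id -- replace with the id from the official release.
ds = load_dataset("tagfn/TAGFN")

print(ds)                     # lists the available splits
print(ds["train"][0].keys())  # inspect the per-article fields
```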
Methodology
- Data collection – The authors harvested news articles, their metadata, and the social graph of user interactions from multiple public platforms (e.g., Twitter, Reddit). Each article becomes a node with a text attribute (the article body) and metadata attributes (publisher, timestamp, etc.). Edges capture relationships such as “shared by the same user,” “cites,” or “replies to.”
- Annotation – Articles were labeled as real or fake using verified fact‑checking sources (e.g., PolitiFact, Snopes). The labeling process was semi‑automated and then manually audited to ensure high precision.
- Graph construction – A heterogeneous graph is built where different edge types are preserved, enabling models to learn from both structural patterns (e.g., echo‑chamber clusters) and textual cues (a construction sketch follows this list).
- Benchmark design – The dataset is split into training/validation/test sets for supervised learning and also provides an unsupervised outlier‑detection split where only a small fraction of nodes are labeled.
- Baseline implementations – The authors evaluate classic outlier detectors (e.g., LOF, Isolation Forest), GNN‑based methods (e.g., GraphSAGE, GAT), and LLM‑enhanced pipelines that concatenate node embeddings from a frozen LLM with graph embeddings (see the fusion sketch after this list).
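To make the graph‑construction step concrete, below is a minimal sketch of assembling a heterogeneous graph in PyTorch Geometric. The node types, edge types, feature dimensions, and counts are illustrative assumptions, not TAGFN's actual schema.

```python
import torch
from torch_geometric.data import HeteroData

# Illustrative schema -- TAGFN's real field names and sizes may differ.
data = HeteroData()

# Article nodes carry text embeddings (e.g., from BERT) plus real/fake labels.
data["article"].x = torch.randn(1000, 768)        # [num_articles, embed_dim]
data["article"].y = torch.randint(0, 2, (1000,))  # 0 = real, 1 = fake

# User nodes with simple profile features.
data["user"].x = torch.randn(5000, 32)

# Distinct edge types are preserved rather than merged into one relation.
data["user", "shares", "article"].edge_index = torch.stack([
    torch.randint(0, 5000, (20000,)),  # source user ids
    torch.randint(0, 1000, (20000,)),  # target article ids
])
data["article", "cites", "article"].edge_index = torch.randint(0, 1000, (2, 3000))

print(data)
```

Keeping relations separate lets heterogeneous GNNs weight user‑share and citation edges differently, which matters given the findings below.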
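And a sketch of the LLM‑enhanced baseline pattern: GraphSAGE embeddings are concatenated with precomputed frozen‑LLM text embeddings before classification. The dimensions and layer sizes are assumptions for illustration, not the paper's configuration.

```python
import torch
import torch.nn.functional as F
from torch_geometric.nn import SAGEConv

class LLMGraphFusion(torch.nn.Module):
    """Concatenates frozen-LLM text embeddings with GraphSAGE embeddings."""
    def __init__(self, feat_dim=768, llm_dim=4096, hidden=256, num_classes=2):
        super().__init__()
        self.conv1 = SAGEConv(feat_dim, hidden)
        self.conv2 = SAGEConv(hidden, hidden)
        # The classifier sees [graph embedding ; LLM embedding].
        self.head = torch.nn.Linear(hidden + llm_dim, num_classes)

    def forward(self, x, edge_index, llm_emb):
        h = F.relu(self.conv1(x, edge_index))
        h = self.conv2(h, edge_index)
        return self.head(torch.cat([h, llm_emb], dim=-1))

# llm_emb is computed once, offline, by running each article's text through
# a frozen LLM; only the GNN layers and the linear head are trained.
```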
Results & Findings
| Model | Setting | ROC‑AUC | Precision@100 | Comment |
|---|---|---|---|---|
| Isolation Forest (features only) | Unsupervised | 0.71 | 0.42 | Struggles without graph context |
| GraphSAGE | Supervised | 0.84 | 0.68 | Gains from structural cues |
| GAT + Text Embedding (BERT) | Supervised | 0.88 | 0.73 | Attention over neighbors helps |
| LLM‑Fine‑Tuned (LLaMA‑7B) + GraphSAGE | Supervised | 0.92 | 0.81 | LLM provides richer semantic signals |
| LLM‑Zero‑Shot Prompting | Unsupervised | 0.78 | 0.55 | Competitive without any fine‑tuning |
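For reference, the two reported metrics are straightforward to compute with scikit‑learn and NumPy. Precision@100 is interpreted here as precision over the 100 highest‑scoring nodes, which is an assumption about the paper's exact definition.

```python
import numpy as np
from sklearn.metrics import roc_auc_score

def precision_at_k(y_true, scores, k=100):
    """Precision over the k predictions with the highest outlier scores."""
    top_k = np.argsort(scores)[::-1][:k]
    return y_true[top_k].mean()

# Toy data: 1 = fake, scores = model outlier scores.
y_true = np.random.randint(0, 2, 1000)
scores = np.random.rand(1000)

print("ROC-AUC:      ", roc_auc_score(y_true, scores))
print("Precision@100:", precision_at_k(y_true, scores))
```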
- LLM‑augmented models consistently outperform pure graph or pure text baselines, confirming that large‑scale language understanding adds value to graph‑based outlier detection.
- Unsupervised LLM prompting (e.g., “Is this article likely fake?”) already beats many classic detectors, showing promise for low‑resource scenarios (a prompting sketch follows these findings).
- The heterogeneous edge types (user‑share vs. citation) contribute differently; user‑share edges are the strongest signal for clustering fake news.
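As a sketch of that zero‑shot setup, one might query an instruction‑tuned model directly. The checkpoint, prompt wording, and answer parsing are illustrative choices, not the paper's exact protocol.

```python
from transformers import pipeline

# Any instruction-tuned checkpoint works; this one is an illustrative choice.
generator = pipeline("text-generation", model="meta-llama/Llama-2-7b-chat-hf")

article = "..."  # article body taken from a TAGFN node
prompt = (
    "Is this article likely fake? Answer with 'fake' or 'real'.\n\n"
    f"Article: {article}\n\nAnswer:"
)

out = generator(prompt, max_new_tokens=5)[0]["generated_text"]
answer = out[len(prompt):].lower()  # keep only the generated completion
print("fake" if "fake" in answer else "real")
```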
Practical Implications
- Misinformation pipelines: Companies building real‑time fact‑checking tools can plug TAGFN‑trained models into their content moderation stacks, leveraging both social‑graph dynamics and article semantics.
- LLM fine‑tuning for domain‑specific safety: The provided fine‑tuning scripts let developers adapt any open‑source LLM to detect fake news with minimal labeled data, reducing reliance on costly human annotation (a parameter‑efficient sketch follows this list).
- Graph‑aware recommendation systems: Platforms can use the outlier scores to down‑rank or flag suspicious content before it spreads, improving user trust.
- Benchmark for research & product teams: TAGFN offers a reproducible testbed for evaluating new GNN architectures, contrastive learning on graphs, or prompt‑engineering strategies for LLMs in the misinformation space.
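To illustrate the fine‑tuning point above, here is a minimal parameter‑efficient sketch with the `peft` library. The base checkpoint, LoRA hyperparameters, and sequence‑classification framing are assumptions, not the authors' released scripts.

```python
from transformers import AutoModelForSequenceClassification, AutoTokenizer
from peft import LoraConfig, get_peft_model

# Illustrative base model; the released scripts may target another checkpoint.
name = "meta-llama/Llama-2-7b-hf"
tokenizer = AutoTokenizer.from_pretrained(name)
model = AutoModelForSequenceClassification.from_pretrained(name, num_labels=2)

# LoRA freezes the base weights and trains small low-rank adapters instead.
config = LoraConfig(r=8, lora_alpha=16,
                    target_modules=["q_proj", "v_proj"],
                    task_type="SEQ_CLS")
model = get_peft_model(model, config)
model.print_trainable_parameters()  # a tiny fraction of the 7B parameters

# From here, train with transformers.Trainer on TAGFN's labeled article texts.
```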
Limitations & Future Work
- Temporal bias: The dataset captures a snapshot of news from a specific period; models may degrade as topics and manipulation tactics evolve.
- Platform coverage: While Twitter and Reddit are well represented, other channels (e.g., private messaging apps) are missing, limiting generalizability.
- Label noise: Even with fact‑checking sources, some borderline cases remain ambiguous, potentially affecting supervised training.
- Scalability of LLM fine‑tuning: Fine‑tuning large models (≥13B parameters) still demands substantial GPU resources, which may be prohibitive for smaller teams.
Future directions suggested by the authors include extending TAGFN with temporal edges, incorporating multilingual news, and exploring prompt‑tuning techniques that need less compute while retaining LLM‑level performance.
If you’re interested in experimenting with TAGFN, the dataset and code are ready to clone from Hugging Face and GitHub. Dive in, and you might be the next to push the frontier of trustworthy AI in the fight against fake news.
Authors
- Kay Liu
- Yuwei Han
- Haoyan Xu
- Henry Peng Zou
- Yue Zhao
- Philip S. Yu
Paper Information
- arXiv ID: 2511.21624v1
- Categories: cs.SI, cs.CL
- Published: November 26, 2025
- PDF: https://arxiv.org/pdf/2511.21624v1