[Paper] Enhancing Imbalanced Node Classification via Curriculum-Guided Feature Learning and Three-Stage Attention Network

Published: February 3, 2026 at 01:10 PM EST
4 min read
Source: arXiv - 2602.03808v1

Overview

Imbalanced node classification—where some classes dominate the graph while others are scarce—remains a major obstacle for Graph Neural Networks (GNNs). The paper introduces CL3AN‑GNN, a curriculum‑guided, three‑stage attention architecture that mimics how humans learn from easy to hard concepts, dramatically improving performance on skewed graph data.

Key Contributions

  • Curriculum‑guided learning for GNNs: A systematic “easy‑to‑hard” training schedule that first focuses on simple, local patterns before tackling complex, multi‑hop relationships.
  • Three‑stage attention mechanism (Engage → Enact → Embed)
    • Engage – isolates easy features (1‑hop neighborhoods, low‑degree nodes, class‑separable pairs).
    • Enact – adaptively re‑weights harder signals (multi‑step connections, heterophilic edges, minority‑class fringe nodes).
    • Embed – consolidates all learned representations via iterative message passing and curriculum‑aligned loss weighting.
  • Curriculum‑aligned loss weighting: Dynamically adjusts the contribution of each stage to the overall loss, stabilizing training under severe label skew.
  • Extensive empirical validation: Tested on eight Open Graph Benchmark (OGB) datasets covering social, biological, and citation networks, achieving consistent gains in accuracy, macro‑F1, and AUC over the latest baselines.
  • Interpretability tools: Gradient‑stability and attention‑correlation visualizations that expose how the model’s focus shifts across curriculum stages.
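Read as pseudocode, the three stages compose sequentially, with a curriculum-weighted blend of their outputs. This is an illustrative sketch, not the authors' implementation; all function and parameter names here (`cl3an_forward`, `engage_fn`, `stage_weights`, etc.) are hypothetical:

```python
def cl3an_forward(x, graph, stage_weights, engage_fn, enact_fn, embed_fn):
    """Hypothetical composition of the Engage -> Enact -> Embed stages.

    Each stage function maps (features, graph) -> features; stage_weights
    is a (w_engage, w_enact, w_embed) triple supplied by the curriculum
    schedule. The weighted blend below is an assumption for illustration.
    """
    h_engage = engage_fn(x, graph)        # stage 1: easy, local patterns
    h_enact = enact_fn(h_engage, graph)   # stage 2: re-weighted hard signals
    h_embed = embed_fn(h_enact, graph)    # stage 3: consolidated representation
    w1, w2, w3 = stage_weights
    # Blend the stage outputs according to the current curriculum weights.
    return [w1 * a + w2 * b + w3 * c
            for a, b, c in zip(h_engage, h_enact, h_embed)]
```

In a real pipeline each stage function would wrap a GCN/GAT attention block; here they are left abstract to show only the control flow.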

Methodology

  1. Feature Pre‑selection (Engage)

    • Compute initial node embeddings with a shallow GCN and a GAT.
    • Identify “easy” nodes: those with low degree, strong local homophily, and clear class separation (via cosine similarity of embeddings).
    • Feed only these easy features into the first attention block, allowing the network to learn a stable base representation.
  2. Adaptive Hard‑Example Emphasis (Enact)

    • Introduce a second attention layer that assigns higher weights to:
      • Multi‑hop neighborhoods (capturing long‑range dependencies).
      • Heterophilic edges (links between different classes).
      • Nodes on the periphery of minority classes (often mis‑classified).
    • The attention scores are learned jointly with the node embeddings, enabling the model to “focus” where it matters most.
  3. Iterative Consolidation (Embed)

    • A final attention‑driven message‑passing stage aggregates the refined features from Engage and Enact.
    • The loss function is split into stage‑specific components, each multiplied by a curriculum weight that gradually shifts emphasis from Engage → Enact → Embed as training progresses.
  4. Training Pipeline

    • Early epochs: high weight on Engage loss → stable convergence on easy patterns.
    • Mid epochs: increase Enact weight → the model starts to correct hard examples.
    • Late epochs: dominant Embed loss → fine‑tune the full representation for final classification.
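The epoch-dependent weighting above can be sketched as a piecewise-linear schedule. The paper describes dynamic curriculum-aligned loss weighting; the exact functional form below is an assumption for illustration:

```python
def curriculum_weights(epoch, total_epochs):
    """Hypothetical schedule shifting loss emphasis Engage -> Enact -> Embed.

    Returns (w_engage, w_enact, w_embed) summing to 1. The piecewise-linear
    shape is an assumption, not the paper's exact schedule.
    """
    t = epoch / max(total_epochs - 1, 1)   # training progress in [0, 1]
    w_engage = max(0.0, 1.0 - 2.0 * t)     # dominant early, fades by mid-training
    w_embed = max(0.0, 2.0 * t - 1.0)      # zero until mid-training, dominant late
    w_enact = 1.0 - w_engage - w_embed     # peaks in the middle of training
    return w_engage, w_enact, w_embed
```

Multiplying each stage-specific loss by its weight then gives the curriculum-aligned objective, e.g. `L = w_engage * L_engage + w_enact * L_enact + w_embed * L_embed`.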

The overall pipeline is lightweight (no extra parameters beyond standard GNN layers) and can be dropped into existing GNN stacks.
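The Engage stage's easy-node criteria (low degree, 1-hop homophily, embedding separability) could be implemented roughly as follows. The threshold values and function names are illustrative assumptions, not values from the paper:

```python
import math

def cosine(u, v):
    """Cosine similarity between two embedding vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    nu = math.sqrt(sum(a * a for a in u))
    nv = math.sqrt(sum(b * b for b in v))
    return dot / (nu * nv) if nu > 0 and nv > 0 else 0.0

def is_easy(node, adj, labels, emb, centroids,
            deg_cap=3, homophily_min=0.5, margin_min=0.1):
    """Apply the three Engage criteria to one node.

    adj: node -> list of neighbor ids; labels: node -> class id;
    emb: node -> embedding; centroids: class id -> class-mean embedding.
    All thresholds are illustrative, not from the paper.
    """
    nbrs = adj[node]
    if len(nbrs) == 0 or len(nbrs) > deg_cap:   # criterion 1: low degree
        return False
    same = sum(1 for n in nbrs if labels[n] == labels[node])
    if same / len(nbrs) < homophily_min:        # criterion 2: 1-hop homophily
        return False
    own = cosine(emb[node], centroids[labels[node]])
    best_other = max(cosine(emb[node], c)       # criterion 3: class separation
                     for lbl, c in centroids.items() if lbl != labels[node])
    return own - best_other >= margin_min
```

Only nodes passing all three checks would feed the first attention block, giving the network a stable base before harder examples are introduced.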

Results & Findings

| Dataset (OGB) | Baseline (e.g., GraphSMOTE) | CL3AN‑GNN | Δ Accuracy | Δ Macro‑F1 |
|---|---|---|---|---|
| ogbn‑arxiv | 71.4 % | 74.9 % | +3.5 % | +4.2 % |
| ogbn‑products | 62.1 % | 66.0 % | +3.9 % | +5.0 % |
| ogbn‑proteins | 68.7 % | 71.5 % | +2.8 % | +3.6 % |
| … (5 more) | | | | |
  • Consistent gains across all eight benchmarks in accuracy, macro‑F1, and AUC.
  • Faster convergence: CL3AN‑GNN reaches 90 % of its final performance in roughly 30 % fewer epochs than end‑to‑end baselines.
  • Robustness to unseen imbalance: When the class distribution is artificially skewed further, the curriculum‑trained model degrades far less than competing methods.
  • Interpretability: Attention heatmaps show a clear transition from focusing on local neighborhoods (early stage) to long‑range, heterophilic edges (later stage), matching the curriculum design.

Practical Implications

  • Better minority‑class detection for fraud, rare disease gene prediction, or niche recommendation systems without needing costly oversampling or synthetic node generation.
  • Plug‑and‑play upgrade: Since CL3AN‑GNN builds on standard GCN/GAT layers, developers can integrate it into existing PyTorch‑Geometric or DGL pipelines with a few lines of code.
  • Reduced training time: The curriculum schedule stabilizes early learning, meaning fewer epochs and lower GPU hours—valuable for large‑scale industrial graphs.
  • Explainable GNN decisions: Stage‑wise attention visualizations can be exposed to end users or auditors to justify why a model flagged a node as belonging to a rare class.
  • Transferability: The curriculum framework can be adapted to other graph tasks (link prediction, graph classification) where data imbalance is a concern.

Limitations & Future Work

  • Curriculum design heuristics: The current “easy‑to‑hard” criteria (degree, 1‑hop homophily, embedding separability) are hand‑crafted; learning these criteria automatically could further improve adaptability.
  • Scalability to billion‑node graphs: While the method adds minimal overhead, the extra attention passes may still be a bottleneck for ultra‑large graphs; distributed implementations are needed.
  • Heterophily beyond two hops: The Enact stage focuses on multi‑step connections up to a fixed radius; extending to dynamic radii or graph‑level reasoning is an open direction.
  • Broader curriculum schedules: Exploring non‑linear or reinforcement‑learning‑driven curriculum pacing could yield even faster convergence.

Overall, CL3AN‑GNN offers a compelling, developer‑friendly recipe for tackling class imbalance in graph‑structured data, marrying curriculum learning principles with modern attention‑based GNNs.

Authors

  • Abdul Joseph Fofanah
  • Lian Wen
  • David Chen
  • Shaoyang Zhang

Paper Information

  • arXiv ID: 2602.03808v1
  • Categories: cs.LG, cs.AI
  • Published: February 3, 2026