[Paper] Analysis of Dirichlet Energies as Over-smoothing Measures
Source: arXiv - 2512.09890v1
Overview
The paper Analysis of Dirichlet Energies as Over‑smoothing Measures investigates why two popular ways of measuring over‑smoothing in graph neural networks (GNNs) – the Dirichlet energy based on the unnormalized Laplacian and the one based on the normalized Laplacian – behave differently. By grounding the discussion in a formal “node‑similarity” axiom set, the authors show that the normalized version actually violates these axioms, which has direct consequences for how practitioners should monitor and control over‑smoothing in real‑world GNN pipelines.
Key Contributions
- Axiomatic critique: Proves that the Dirichlet energy derived from the normalized Laplacian does not satisfy the node‑similarity axioms introduced by Rusch et al. (2022).
- Spectral comparison: Provides a clean, side‑by‑side spectral analysis of the unnormalized vs. normalized Laplacian Dirichlet energies, exposing the root cause of their divergent behavior.
- Guidelines for metric selection: Offers concrete criteria for choosing the “spectrally compatible” energy measure that aligns with a given GNN architecture (e.g., GCN, GraphSAGE, GAT).
- Resolution of ambiguity: Clarifies why previous empirical studies sometimes reported contradictory over‑smoothing trends when switching between the two energies.
Methodology
- Formal definition of node‑similarity – The authors adopt the axioms (non‑negativity, symmetry, identity of indiscernibles, and monotonicity under graph diffusion) from Rusch et al. to set a benchmark for any over‑smoothing metric.
- Spectral decomposition – Both Laplacians are expressed in terms of their eigenvalues/eigenvectors. The Dirichlet energy of a node feature matrix $X$ is written as
  $$E_{\mathcal{L}}(X) = \operatorname{tr}\left(X^\top \mathcal{L}\, X\right).$$
- Axiom testing – By plugging the normalized Laplacian $\mathcal{L}_{\text{norm}} = I - D^{-1/2} A D^{-1/2}$ into the axioms, the authors construct counter‑examples (e.g., graphs with highly heterogeneous degree distributions) that break the monotonicity axiom.
- Comparative experiments – Small synthetic graphs and standard benchmark datasets (Cora, PubMed, ogbn‑arxiv) are used to illustrate how the two energies evolve across layers of popular GNNs.
The approach stays high‑level enough for developers: think of it as “checking whether a smoothness score behaves like a proper distance” by looking at the eigen‑spectrum of the graph operator.
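For concreteness, here is a minimal sketch (not the authors' code) of both energies computed straight from the definition $E_{\mathcal{L}}(X) = \operatorname{tr}(X^\top \mathcal{L} X)$. The toy path graph, the feature dimension, and the helper name `dirichlet_energy` are illustrative choices:

```python
# Minimal sketch: Dirichlet energies of a feature matrix X under the
# unnormalized and the normalized graph Laplacian (NumPy only).
import numpy as np

def dirichlet_energy(X: np.ndarray, L: np.ndarray) -> float:
    """E_L(X) = tr(X^T L X) for a given graph operator L."""
    return float(np.trace(X.T @ L @ X))

# Toy undirected graph: a path on 4 nodes.
A = np.array([[0, 1, 0, 0],
              [1, 0, 1, 0],
              [0, 1, 0, 1],
              [0, 0, 1, 0]], dtype=float)
deg = A.sum(axis=1)
L_unnorm = np.diag(deg) - A                              # L = D - A
D_inv_sqrt = np.diag(1.0 / np.sqrt(deg))
L_norm = np.eye(len(deg)) - D_inv_sqrt @ A @ D_inv_sqrt  # I - D^{-1/2} A D^{-1/2}

X = np.random.default_rng(0).normal(size=(4, 2))         # random node features

print("unnormalized energy:", dirichlet_energy(X, L_unnorm))
print("normalized energy:  ", dirichlet_energy(X, L_norm))
```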
Results & Findings
| Metric | Satisfies node‑similarity axioms? | Typical behavior in GNN layers |
|---|---|---|
| Unnormalized Dirichlet energy ($\mathcal{L} = D - A$) | ✅ Yes | Decreases monotonically as layers increase, cleanly reflecting over‑smoothing. |
| Normalized Dirichlet energy ($\mathcal{L}_{\text{norm}}$) | ❌ No (fails monotonicity) | Can increase after a few layers on irregular graphs, misleading developers about the true smoothness. |
Key takeaways
- The unnormalized energy is a reliable “over‑smoothing alarm” across a wide range of graph topologies.
- Normalized energy’s dependence on degree scaling makes it sensitive to heterogeneity, causing spurious spikes that do not correspond to actual loss of discriminative power.
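To see the degree-sensitivity point in action, the sketch below (a toy illustration, not the paper's benchmark code) repeatedly applies GCN-style propagation $\hat{D}^{-1/2}(A+I)\hat{D}^{-1/2}X$ on a star graph, a deliberately degree-heterogeneous example, and prints both energies per layer. The graph size, feature dimension, and layer count are arbitrary assumptions; the printed numbers are not the paper's results.

```python
# Toy illustration: track both Dirichlet energies while features are
# repeatedly smoothed by a GCN-style propagation operator on a star graph.
import numpy as np

def dirichlet_energy(X, L):
    return float(np.trace(X.T @ L @ X))

n = 8                                       # star: node 0 linked to nodes 1..7
A = np.zeros((n, n))
A[0, 1:] = A[1:, 0] = 1.0
deg = A.sum(axis=1)
L_unnorm = np.diag(deg) - A
D_inv_sqrt = np.diag(1.0 / np.sqrt(deg))
L_norm = np.eye(n) - D_inv_sqrt @ A @ D_inv_sqrt

# GCN-style propagation with self-loops: P = D_hat^{-1/2} (A + I) D_hat^{-1/2}
A_hat = A + np.eye(n)
d_hat = A_hat.sum(axis=1)
P = np.diag(d_hat ** -0.5) @ A_hat @ np.diag(d_hat ** -0.5)

X = np.random.default_rng(1).normal(size=(n, 4))
for layer in range(1, 11):
    X = P @ X                               # one round of feature smoothing
    print(f"layer {layer:2d}  "
          f"E_unnorm = {dirichlet_energy(X, L_unnorm):8.4f}  "
          f"E_norm = {dirichlet_energy(X, L_norm):8.4f}")
```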
Practical Implications
- Model debugging: A sudden rise in the normalized Dirichlet energy during training may be a spurious effect of degree scaling rather than a genuine recovery of node distinctiveness. Switch to the unnormalized version for a trustworthy signal.
- Architecture‑aware regularization: Many GNN regularizers (e.g., Laplacian smoothing, DropEdge) are derived from the unnormalized Laplacian. Aligning your over‑smoothing metric with the same operator avoids mismatched objectives.
- Hyper‑parameter tuning: Early‑stopping criteria based on Dirichlet energy can now be implemented with confidence, using the unnormalized version to decide when a model has become too smooth.
- Tooling: Libraries such as PyG or DGL could expose a simple utility that computes the unnormalized Dirichlet energy $\operatorname{tr}(X^\top L X)$ from node features and the graph structure, making it a one-liner in training loops (a sketch follows below).
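Here is a hedged sketch of what such a monitoring utility could look like in a PyTorch training loop. The edge-based formula below equals $\operatorname{tr}(X^\top L X)$ when each undirected edge is listed once; the function name, the $2 \times E$ `edge_index` layout, and the `energy_threshold` are illustrative assumptions, not an existing PyG/DGL API.

```python
# Sketch of an over-smoothing monitor based on the unnormalized Dirichlet energy.
import torch

def unnorm_dirichlet_energy(X: torch.Tensor, edge_index: torch.Tensor) -> torch.Tensor:
    """Sum of ||x_i - x_j||^2 over the edges in edge_index (shape [2, E]).

    Equals tr(X^T (D - A) X) when each undirected edge appears once;
    halve the result if edges are stored in both directions (PyG convention).
    """
    src, dst = edge_index
    return ((X[src] - X[dst]) ** 2).sum()

# Illustrative use inside a training loop (model outputs, data, and the
# threshold are assumed to exist in the surrounding code):
#
#     energy = unnorm_dirichlet_energy(hidden, data.edge_index)
#     if energy < energy_threshold:   # embeddings have collapsed toward each other
#         print("over-smoothing alarm: consider early stopping or fewer layers")
```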
Limitations & Future Work
- The analysis is primarily theoretical and validated on relatively small benchmark graphs; scaling to massive, dynamic graphs (e.g., billions of nodes) may reveal additional nuances.
- Only static GNN architectures are considered; extensions to temporal GNNs or attention‑based diffusion operators remain open.
- The authors suggest exploring adaptive Laplacian choices (e.g., learned degree normalizations) that could combine the stability of the unnormalized energy with the scale‑invariance benefits of the normalized version.
Bottom line for developers: If you need a reliable, mathematically sound gauge of over‑smoothing in your GNN pipelines, stick with the Dirichlet energy built from the unnormalized graph Laplacian. It respects the core similarity axioms, behaves predictably across layers, and integrates seamlessly with existing regularization tricks.
Authors
- Anna Bison
- Alessandro Sperduti
Paper Information
- arXiv ID: 2512.09890v1
- Categories: cs.LG
- Published: December 10, 2025
- PDF: https://arxiv.org/pdf/2512.09890v1