[Paper] Belief Propagation Converges to Gaussian Distributions in Sparsely-Connected Factor Graphs
Source: arXiv - 2601.21935v1
Overview
The paper proves that Belief Propagation (BP)—a cornerstone algorithm for distributed probabilistic inference—naturally drives node beliefs toward Gaussian shapes when run on large, sparsely‑connected factor graphs. By leveraging the Central Limit Theorem, the authors give a solid theoretical footing for the widespread empirical success of Gaussian Belief Propagation (GBP) in domains such as computer vision, robotics, and sensor networks, even when the underlying problem is far from Gaussian.
Key Contributions
- Theoretical guarantee: Shows that, under four mild assumptions, the marginal beliefs produced by standard (non‑Gaussian) BP converge to Gaussian distributions as the number of BP iterations grows.
- CLT‑based proof: Adapts the Central Limit Theorem to the message‑passing dynamics of loopy factor graphs, handling the dependencies introduced by cycles.
- Practical validation: Demonstrates the Gaussian convergence on a real‑world stereo depth estimation task, observing near‑Gaussian marginals after only a few iterations.
- Guidelines for practitioners: Provides a checklist of graph properties (sparsity, bounded degree, etc.) that make GBP a safe approximation in spatial AI pipelines.
- Open‑source reference implementation: Releases a lightweight Python/NumPy library that reproduces the experiments and can be plugged into existing BP frameworks.
Methodology
- Problem setting – The authors consider a generic factor graph G = (V, F, E) in which variables x_i interact through factors f_a. The graph is sparse (average degree bounded) and may contain many loops, reflecting typical spatial AI structures (e.g., pixel‑wise depth variables linked by local smoothness factors).
- Assumptions – Four conditions are required:
- Bounded factor degree (each factor touches only a constant number of variables).
- Finite second moments for all local potentials.
- Uniformly bounded message variances across iterations.
- A “mixing” condition ensuring that each variable receives messages from a growing number of independent sources as BP proceeds.
- CLT adaptation – By treating each incoming message to a variable as an approximately independent random contribution (an approximation justified by the mixing condition), the authors show that the log‑belief, being a sum of such contributions, satisfies the Lindeberg–Feller condition of the CLT and therefore converges to a normal distribution.
- Iterative analysis – They prove that the convergence holds after a finite number of BP updates, not only asymptotically, thanks to the sparsity‑induced independence.
- Empirical testbed – A stereo depth estimation pipeline is built: raw pixel disparities are modeled with a non‑Gaussian likelihood, while smoothness factors enforce local consistency. The BP algorithm is run for 5–10 iterations, and the resulting marginal histograms are fitted to Gaussians to quantify the convergence.
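The CLT mechanism behind the proof can be illustrated numerically. The sketch below is not the paper's construction; it simply models a log‑belief as a sum of independent heavy‑tailed (Laplace) "message" contributions, as in the intuition above, and measures how fast the excess kurtosis (a standard non‑Gaussianity indicator) shrinks as more messages accumulate:

```python
import numpy as np

rng = np.random.default_rng(0)

def log_belief_samples(n_messages, n_samples=50_000):
    """Toy model: a log-belief as a sum of independent heavy-tailed
    'message' contributions (Laplace-distributed, i.e., non-Gaussian)."""
    contributions = rng.laplace(loc=0.0, scale=1.0, size=(n_samples, n_messages))
    return contributions.sum(axis=1)

def excess_kurtosis(x):
    """Excess kurtosis is 0 for a Gaussian; Laplace has excess kurtosis 3."""
    z = (x - x.mean()) / x.std()
    return float(np.mean(z**4) - 3.0)

# With one message the belief is plainly non-Gaussian; with 25 messages
# the sum's excess kurtosis shrinks roughly like 3 / n_messages.
k1 = excess_kurtosis(log_belief_samples(1))
k25 = excess_kurtosis(log_belief_samples(25))
```

This mirrors the qualitative claim of the paper: the more (approximately) independent message contributions a variable aggregates, the closer its belief gets to Gaussian, regardless of the shape of the individual potentials.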
Results & Findings
- Gaussian convergence speed: In the stereo experiment, the Kullback–Leibler divergence between the empirical belief and its best‑fit Gaussian drops by >80 % after just three BP iterations.
- Robustness to non‑Gaussian priors: Even when the observation model is heavy‑tailed (e.g., Laplace noise), the resulting beliefs still become Gaussian under the stated graph conditions.
- Scalability: The proof holds for graphs with millions of variables, confirming that the result is not a small‑scale artifact.
- Comparison with GBP: Running a pure Gaussian BP (which forces Gaussian messages from the start) yields virtually identical final marginals, validating GBP as a near‑optimal shortcut in these settings.
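The "KL divergence to the best‑fit Gaussian" metric used in the stereo experiment is straightforward to compute from samples. The helper below is an illustrative reimplementation (not the paper's code): it moment‑matches a Gaussian to the samples and evaluates a discretized KL divergence on a shared histogram grid:

```python
import numpy as np

rng = np.random.default_rng(1)

def kl_to_best_fit_gaussian(samples, n_bins=80):
    """Discretized KL(empirical || moment-matched Gaussian).

    Moment matching (mean, std) gives the KL-minimizing Gaussian fit,
    so this measures how far a belief is from *any* Gaussian."""
    mu, sigma = samples.mean(), samples.std()
    counts, edges = np.histogram(samples, bins=n_bins)
    centers = 0.5 * (edges[:-1] + edges[1:])
    p = counts / counts.sum()                        # empirical bin probabilities
    q = np.exp(-0.5 * ((centers - mu) / sigma) ** 2) # Gaussian bin weights
    q = q / q.sum()
    mask = p > 0
    return float(np.sum(p[mask] * np.log(p[mask] / q[mask])))

# A single heavy-tailed 'message' vs. an aggregate of 25 such messages:
kl_1 = kl_to_best_fit_gaussian(rng.laplace(size=100_000))
kl_25 = kl_to_best_fit_gaussian(rng.laplace(size=(100_000, 25)).sum(axis=1))
```

On this toy data the divergence collapses as contributions accumulate, the same qualitative behavior the paper reports over BP iterations.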
Practical Implications
- Confidence in GBP: Developers can safely replace full‑distribution BP with the far cheaper Gaussian version in large‑scale SLAM, depth estimation, or sensor‑fusion systems, knowing that the approximation error is theoretically bounded.
- Memory & compute savings: Gaussian messages require only mean and variance, cutting memory footprints by >90 % and enabling real‑time inference on edge devices (e.g., drones, AR glasses).
- Simplified algorithm design: Engineers can design factor graphs that respect the four assumptions (e.g., keep factor arity low, avoid dense all‑to‑all connections) to guarantee Gaussian convergence, turning a theoretical result into a design checklist.
- Hybrid pipelines: For parts of a model that violate the assumptions (high‑degree factors, strongly multimodal priors), a mixed approach can be used—run full BP locally and switch to GBP elsewhere—without sacrificing overall accuracy.
- Tooling: The provided Python library can be dropped into existing PyTorch or JAX pipelines, offering a drop‑in Gaussian message‑passing layer that automatically tracks convergence diagnostics.
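The memory‑savings claim follows from how Gaussian messages compose. In standard GBP bookkeeping (information, or precision, form, a textbook identity rather than anything specific to this paper), taking the product of Gaussian messages reduces to summing precisions and precision‑weighted means, so each message really is just two floats:

```python
import numpy as np

def combine_gaussian_messages(means, variances):
    """Product of 1-D Gaussian messages in information (precision) form.

    Each message is just (mean, variance) -- two floats -- which is the
    source of GBP's memory savings over full-distribution messages."""
    lam = 1.0 / np.asarray(variances, dtype=float)  # precisions add
    eta = lam * np.asarray(means, dtype=float)      # precision-weighted means add
    lam_post, eta_post = lam.sum(), eta.sum()
    return eta_post / lam_post, 1.0 / lam_post      # posterior mean, variance

# Three incoming messages about the same variable; the tighter (lower-
# variance) messages dominate the combined estimate.
mean, var = combine_gaussian_messages([1.0, 2.0, 4.0], [1.0, 1.0, 2.0])
```

Compared with storing a discretized distribution per message (often hundreds of bins), two scalars per edge is where the >90 % footprint reduction comes from.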
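The design checklist can also be made executable. The function below is a hypothetical helper (the names and thresholds are illustrative, not values from the paper) that screens a factor graph for two of the structural conditions: bounded factor arity and sparse average variable degree:

```python
def check_gbp_assumptions(factors, max_arity=4, max_avg_degree=6.0):
    """Heuristic screen for two graph-structural conditions:
    bounded factor degree (arity) and bounded average variable degree.

    `factors` maps a factor name to the tuple of variable ids it touches.
    Thresholds here are illustrative placeholders, not the paper's values.
    """
    arity_ok = all(len(vs) <= max_arity for vs in factors.values())
    degree = {}
    for vs in factors.values():
        for v in vs:
            degree[v] = degree.get(v, 0) + 1
    avg_degree = sum(degree.values()) / max(len(degree), 1)
    return arity_ok and avg_degree <= max_avg_degree

# A chain of pairwise smoothness factors (grid-like, sparse): passes.
pairwise = {f"s{i}": (i, i + 1) for i in range(100)}
ok = check_gbp_assumptions(pairwise)
```

A fully connected factor touching all variables, by contrast, fails the arity check, matching the warning above about dense all‑to‑all connections.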
Limitations & Future Work
- Assumption strictness: The proof requires bounded factor degree and uniformly bounded message variances; highly dense factor graphs (e.g., fully connected CRFs) fall outside the guarantee.
- Non‑stationary data: The CLT argument assumes a static graph; extending the result to dynamically changing factor structures (common in online SLAM) remains open.
- Higher‑order moments: While means and variances converge, the paper does not quantify how quickly skewness or kurtosis vanish, which could matter for downstream decision thresholds.
- Empirical breadth: Experiments focus on a single stereo depth task; future work should validate the theory across other domains such as natural‑language parsing, recommender systems, or large‑scale Bayesian networks.
Bottom line: If you’re building a spatial‑AI system that relies on belief propagation, this paper gives you both the theoretical confidence and practical guidelines to lean on Gaussian Belief Propagation—turning a heuristic shortcut into a provably sound engineering choice.
Authors
- Tom Yates
- Yuzhou Cheng
- Ignacio Alzugaray
- Danyal Akarca
- Pedro A. M. Mediano
- Andrew J. Davison
Paper Information
- arXiv ID: 2601.21935v1
- Categories: cs.DC
- Published: January 29, 2026