[Paper] Formal Analysis of the Sigmoid Function and Formal Proof of the Universal Approximation Theorem

Published: December 3, 2025 at 05:16 AM EST
Source: arXiv - 2512.03635v1

Overview

The authors present a fully mechanised formalisation of the sigmoid activation function and a constructive proof of the Universal Approximation Theorem (UAT) in Isabelle/HOL, an interactive theorem prover. By turning the classical analysis of sigmoids into machine‑checked facts, they lay the groundwork for formally verified neural‑network components, a step toward trustworthy AI systems.

Key Contributions

  • Formal definition of the sigmoid function in Isabelle/HOL with proofs of monotonicity, smoothness, and closed‑form expressions for all higher‑order derivatives.
  • A mechanised, constructive proof of the Universal Approximation Theorem for feed‑forward networks using sigmoidal activations, showing they can uniformly approximate any continuous function on a compact interval.
  • Extensions to Isabelle/HOL’s real‑analysis library: new tactics for limit reasoning, support for piecewise‑defined functions, and lemmas that fill gaps in the existing libraries.
  • Demonstration of a verification workflow that can be reused for other activation functions or network architectures.
  • Open‑source Isabelle theories released alongside the paper, enabling the community to build on the formalisation.
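
For reference, the analytic facts behind the first contribution can be written out explicitly. The block below is a sketch of standard identities for the logistic function, not a transcription of the Isabelle theories. The notation S(n, k) for Stirling numbers of the second kind is an assumption made here, and the paper's own closed form may be stated differently.

```latex
\begin{align*}
\sigma(x) &= \frac{1}{1 + e^{-x}}, \qquad 0 < \sigma(x) < 1,\\
\sigma'(x) &= \sigma(x)\,\bigl(1 - \sigma(x)\bigr) > 0
  && \text{(hence strictly increasing)},\\
\sigma''(x) &= \sigma(x)\,\bigl(1 - \sigma(x)\bigr)\,\bigl(1 - 2\sigma(x)\bigr),\\
\sigma^{(n)}(x) &= \sum_{k=1}^{n+1} (-1)^{k+1}\,(k-1)!\,S(n+1,k)\,\sigma(x)^{k},\\
\lim_{x \to -\infty} \sigma(x) &= 0, \qquad \lim_{x \to +\infty} \sigma(x) = 1.
\end{align*}
```

Setting n = 1 in the general formula gives σ − σ², which agrees with the second line, and n = 2 gives σ − 3σ² + 2σ³, which agrees with the third.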

Methodology

  1. Encoding the sigmoid – The authors introduce the classic logistic function σ(x) = 1 / (1 + e⁻ˣ) as a higher‑order logic constant, then prove basic analytical properties (boundedness, strict monotonicity, differentiability).
  2. Higher‑order derivatives – Using Isabelle’s real_derivative infrastructure, they recursively derive formulas for σ⁽ⁿ⁾(x) and verify smoothness (C^∞) by induction on n.
  3. Limit reasoning – They implement a lightweight limit calculus (ε‑δ style) that works seamlessly with Isabelle’s existing real analysis tools, allowing concise proofs of σ(x) → 0 as x → –∞ and σ(x) → 1 as x → +∞.
  4. Constructive UAT proof – Rather than relying on non‑constructive existence arguments, the paper builds explicit shallow networks (single hidden layer) whose weights are derived from piecewise linear approximations of the target function. The sigmoid’s smoothness guarantees that these networks converge uniformly.
  5. Mechanisation – Every lemma is checked by Isabelle; proofs are found with its automation (Sledgehammer, auto, simp) and guided manually where needed, giving a fully verified proof chain.
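
The idea behind step 4 can be illustrated numerically. The sketch below is not the paper's construction: it uses the textbook "staircase" argument, in which each hidden unit is a steep sigmoid that switches on one increment of the target function, rather than weights derived from piecewise linear approximations. The names build_network, evaluate, sup_error and the sharpness parameter are hypothetical and chosen only for this example.

```python
import math
from typing import Callable, List, Tuple

Unit = Tuple[float, float, float]   # (output weight, input weight, centre)
Network = Tuple[float, List[Unit]]  # (output bias, hidden units)


def sigmoid(x: float) -> float:
    """Logistic function 1 / (1 + exp(-x)), written to avoid overflow."""
    if x >= 0:
        return 1.0 / (1.0 + math.exp(-x))
    e = math.exp(x)
    return e / (1.0 + e)


def build_network(f: Callable[[float], float], a: float, b: float,
                  n: int, sharpness: float = 20.0) -> Network:
    """Build a single-hidden-layer network with n sigmoid units on [a, b].

    Assumption (illustrative only): unit i is centred at the midpoint of the
    i-th grid cell and carries the increment f(x_i) - f(x_{i-1}), so the
    network behaves like a smoothed staircase through the grid values.
    """
    h = (b - a) / n
    grid = [a + i * h for i in range(n + 1)]
    w = sharpness / h  # steeper sigmoids on finer grids
    units = [
        (f(grid[i]) - f(grid[i - 1]), w, 0.5 * (grid[i - 1] + grid[i]))
        for i in range(1, n + 1)
    ]
    return f(a), units


def evaluate(network: Network, x: float) -> float:
    """Network output: bias plus a weighted sum of shifted, scaled sigmoids."""
    bias, units = network
    return bias + sum(c * sigmoid(w * (x - m)) for c, w, m in units)


def sup_error(f: Callable[[float], float], network: Network,
              a: float, b: float, samples: int = 2000) -> float:
    """Estimate the uniform error on [a, b] by sampling an even grid."""
    xs = (a + (b - a) * k / samples for k in range(samples + 1))
    return max(abs(f(x) - evaluate(network, x)) for x in xs)


if __name__ == "__main__":
    for n in (10, 100):
        net = build_network(math.sin, 0.0, math.pi, n)
        print(n, sup_error(math.sin, net, 0.0, math.pi))
```

Running the script prints the sampled uniform error for 10 and 100 hidden units approximating sin on [0, π]; the error shrinks as the grid is refined. For a Lipschitz target the error of this particular staircase scales roughly with the grid spacing, which is a property of this illustrative construction and not the explicit bound reported in the paper.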

Results & Findings

  • Verified properties: σ is shown to be infinitely differentiable with closed‑form derivative formulas, which puts identities such as σ′(x) = σ(x)(1 − σ(x)), routinely used in back‑propagation implementations, on a machine‑checked footing.
  • Constructive UAT: The formal proof yields explicit bounds on the number of hidden units required to achieve a given approximation error ε, linking network size to function smoothness.
  • Tool improvements: The new limit tactics reduce proof script length by ~30 % compared with previous Isabelle approaches, making future real‑analysis formalisation more ergonomic.
  • Reproducibility: All Isabelle theories compile without external axioms, demonstrating that the entire argument is self‑contained within the higher‑order logic framework.
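
The closed‑form derivative result can be sanity‑checked numerically outside the prover. The sketch below assumes only the standard identity σ′ = σ(1 − σ): it represents each derivative as a polynomial in σ, applies the chain rule to obtain the next one, and compares the result with a finite difference. The function names are hypothetical, and this is a numerical illustration rather than the paper's Isabelle proof.

```python
import math
from typing import List


def sigmoid(x: float) -> float:
    """Logistic function 1 / (1 + exp(-x)), written to avoid overflow."""
    if x >= 0:
        return 1.0 / (1.0 + math.exp(-x))
    e = math.exp(x)
    return e / (1.0 + e)


def derivative_polynomials(order: int) -> List[List[int]]:
    """Return polys where polys[n][k] is the coefficient of sigma**k
    in the n-th derivative of sigma, for n = 0 .. order."""
    polys = [[0, 1]]  # the 0-th derivative is sigma itself
    for _ in range(order):
        p = polys[-1]
        q = [0] * (len(p) + 1)
        for k, c in enumerate(p):
            # d/dx sigma**k = k * sigma**(k-1) * sigma'
            #               = k * sigma**k - k * sigma**(k+1)
            q[k] += k * c
            q[k + 1] -= k * c
        polys.append(q)
    return polys


def sigmoid_derivative(n: int, x: float) -> float:
    """Evaluate the n-th derivative of sigma at x via its polynomial in sigma."""
    s = sigmoid(x)
    coeffs = derivative_polynomials(n)[n]
    return sum(c * s ** k for k, c in enumerate(coeffs))


if __name__ == "__main__":
    print(derivative_polynomials(2)[2])  # [0, 1, -3, 2]
    x, h = 0.7, 1e-5
    # Central difference of the 2nd derivative approximates the 3rd.
    fd = (sigmoid_derivative(2, x + h) - sigmoid_derivative(2, x - h)) / (2 * h)
    print(abs(fd - sigmoid_derivative(3, x)))
```

The coefficient list [0, 1, -3, 2] encodes σ″ = σ − 3σ² + 2σ³, and the final line prints the gap between the polynomial form of the third derivative and its finite‑difference estimate, which should be close to zero.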

Practical Implications

  • Verified AI components – Developers can now import a certified sigmoid library into safety‑critical systems (e.g., autonomous driving, medical diagnostics) and obtain machine‑checked guarantees about activation behaviour.
  • Network sizing guidelines – The constructive UAT provides a principled way to estimate hidden‑layer width for a desired approximation accuracy, useful for resource‑constrained edge deployments.
  • Foundation for further verification – With the sigmoid formalised, extending verification to full training pipelines (gradient descent correctness, loss‑function properties) becomes more tractable.
  • Cross‑tool integration – The Isabelle theories can be passed to code generators (e.g., Isabelle’s built‑in code generator, which targets languages such as Haskell, or Isabelle‑LLVM) to produce formally verified inference code, reducing the risk of implementation bugs.
  • Educational value – The proof scripts serve as concrete examples for teaching formal methods to ML engineers, bridging the gap between theoretical guarantees and practical code.

Limitations & Future Work

  • Limited scope – The results cover single‑hidden‑layer networks on compact intervals; deeper architectures and high‑dimensional input spaces are not yet covered.
  • Sigmoid‑only focus – While the logistic function is popular, many modern models use ReLU, GELU, or attention mechanisms; extending the library to these activations remains an open task.
  • Performance of generated code – The verified implementations prioritize correctness over runtime efficiency; optimizing the extracted code for production workloads is future work.
  • Integration with training dynamics – The current work verifies approximation capability but does not address convergence of learning algorithms; coupling the UAT proof with verified gradient descent is a natural next step.

Authors

  • Dustin Bryant
  • Jim Woodcock
  • Simon Foster

Paper Information

  • arXiv ID: 2512.03635v1
  • Categories: cs.LO, cs.SE
  • Published: December 3, 2025