[Paper] Formal Analysis of the Sigmoid Function and Formal Proof of the Universal Approximation Theorem

Published: December 3, 2025 at 05:16 AM EST

Source: arXiv - 2512.03635v1

Overview

The authors present a fully mechanised formalisation of the sigmoid activation function and a constructive proof of the Universal Approximation Theorem (UAT) in Isabelle/HOL, an interactive theorem prover for higher‑order logic. By turning the classical analysis of sigmoids into machine‑checked facts, they lay groundwork for formally verified neural‑network components, a step toward trustworthy AI systems.

Key Contributions

Formal definition of the sigmoid function in Isabelle/HOL with proofs of monotonicity, smoothness, and closed‑form expressions for all higher‑order derivatives.
A mechanised, constructive proof of the Universal Approximation Theorem for feed‑forward networks using sigmoidal activations, showing they can uniformly approximate any continuous function on a compact interval.
Extension of Isabelle/HOL’s real‑analysis library: new tactics for limit reasoning, handling of piecewise‑defined functions, and bridging gaps in existing libraries.
Demonstration of a verification workflow that can be reused for other activation functions or network architectures.
Open‑source Isabelle theories released alongside the paper, enabling the community to build on the formalisation.
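
For orientation, the one‑dimensional statement usually meant by the theorem can be written as below. This is the standard textbook form, not necessarily the exact formulation mechanised in the paper; the symbols N, α_i, w_i and b_i are notation assumed here.

```latex
% Standard one-dimensional form of the UAT for sigmoidal networks.
% The notation (N, \alpha_i, w_i, b_i) is an assumption, not taken from the paper.
\forall f \in C([a,b]),\ \forall \varepsilon > 0,\
\exists N \in \mathbb{N},\ \alpha_i, w_i, b_i \in \mathbb{R}:\quad
\sup_{x \in [a,b]} \Bigl|\, f(x) - \sum_{i=1}^{N} \alpha_i\, \sigma(w_i x + b_i) \Bigr| < \varepsilon,
\qquad \sigma(x) = \frac{1}{1 + e^{-x}}.
```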

Methodology

Encoding the sigmoid – The authors introduce the classic logistic function σ(x) = 1 / (1 + e⁻ˣ) as a higher‑order logic constant, then prove basic analytical properties (boundedness, strict monotonicity, differentiability).
Higher‑order derivatives – Using Isabelle’s real‑derivative infrastructure (for example the has_real_derivative relation), they derive formulas for σ⁽ⁿ⁾(x) recursively and verify smoothness (C^∞) by induction on n.
Limit reasoning – They implement a lightweight limit calculus (ε‑δ style) that integrates with Isabelle’s existing real‑analysis tools, allowing concise proofs that σ(x) → 0 as x → −∞ and σ(x) → 1 as x → +∞.
Constructive UAT proof – Rather than relying on non‑constructive existence arguments, the paper builds explicit shallow networks (single hidden layer) whose weights are derived from piecewise linear approximations of the target function. The sigmoid’s smoothness guarantees that these networks converge uniformly.
Mechanisation – Every lemma is checked by Isabelle. Proofs are found with the built‑in automation (Sledgehammer, auto, simp) where possible and guided manually where needed, so the whole proof chain is verified.
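
The recursive treatment of higher‑order derivatives can be illustrated numerically. The sketch below is not the authors' Isabelle development; it relies on the well‑known identity σ′ = σ(1 − σ), from which every derivative is a polynomial in σ: P_0(s) = s and P_{n+1}(s) = P_n′(s)·s(1 − s). The helper names (poly_derivative, poly_mul, sigmoid_derivative_poly, sigmoid_nth_derivative) are hypothetical.

```python
import math


def sigmoid(x: float) -> float:
    """Logistic function 1 / (1 + exp(-x)), written to avoid overflow."""
    if x >= 0:
        return 1.0 / (1.0 + math.exp(-x))
    z = math.exp(x)
    return z / (1.0 + z)


def poly_derivative(coeffs):
    """Derivative of a polynomial given as [c0, c1, c2, ...] (ascending powers)."""
    return [k * c for k, c in enumerate(coeffs)][1:]


def poly_mul(p, q):
    """Product of two polynomials in ascending-power coefficient form."""
    out = [0] * (len(p) + len(q) - 1)
    for i, a in enumerate(p):
        for j, b in enumerate(q):
            out[i + j] += a * b
    return out


def sigmoid_derivative_poly(n: int):
    """Coefficients of P_n, where the n-th derivative of sigmoid is P_n(sigmoid(x)).

    Chain rule plus sigma' = sigma * (1 - sigma) gives
    P_0(s) = s and P_{n+1}(s) = P_n'(s) * (s - s^2).
    """
    p = [0, 1]                   # P_0(s) = s
    s_one_minus_s = [0, 1, -1]   # s - s^2
    for _ in range(n):
        p = poly_mul(poly_derivative(p), s_one_minus_s)
    return p


def sigmoid_nth_derivative(x: float, n: int) -> float:
    """Evaluate the n-th derivative of sigmoid at x via the polynomial P_n."""
    s = sigmoid(x)
    return sum(c * s**k for k, c in enumerate(sigmoid_derivative_poly(n)))
```

For example, sigmoid_derivative_poly(2) yields the coefficients of s − 3s² + 2s³, which factors as s(1 − s)(1 − 2s), the familiar second derivative of the logistic function.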

Results & Findings

Verified properties: σ is shown to be infinitely differentiable with closed‑form derivative formulas, confirming the intuition used in back‑propagation implementations.
Constructive UAT: The formal proof yields explicit bounds on the number of hidden units required to achieve a given approximation error ε, linking network size to function smoothness.
Tool improvements: The new limit tactics reduce proof script length by ~30 % compared with previous Isabelle approaches, making future real‑analysis formalisation more ergonomic.
Reproducibility: All Isabelle theories compile without external axioms, demonstrating that the entire argument is self‑contained within the higher‑order logic framework.
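
The link between hidden‑layer width and approximation error can be seen in a small experiment. The sketch below uses the textbook staircase construction (one steep sigmoid per sub‑interval, plus a constant output bias f(a) for simplicity); it is an assumption‑laden illustration, not the construction or the bound proved in the paper. The function names and the sharpness parameter are hypothetical, and the error is only estimated on a sample grid.

```python
import math


def sigmoid(x: float) -> float:
    """Logistic function, written to avoid overflow for large |x|."""
    if x >= 0:
        return 1.0 / (1.0 + math.exp(-x))
    z = math.exp(x)
    return z / (1.0 + z)


def build_shallow_network(f, a, b, n_hidden, sharpness=10.0):
    """Single-hidden-layer network approximating f on [a, b].

    Each hidden unit is a steep sigmoid centred on one sub-interval and
    weighted by the increment of f across that sub-interval.
    Returns (output_bias, [(alpha, w, bias), ...]).
    """
    h = (b - a) / n_hidden
    k = sharpness / h  # steeper units as the partition gets finer
    units = []
    for i in range(1, n_hidden + 1):
        left, right = a + (i - 1) * h, a + i * h
        centre = 0.5 * (left + right)
        units.append((f(right) - f(left), k, -k * centre))
    return f(a), units


def evaluate(network, x):
    """Network output: bias + sum of alpha * sigmoid(w * x + b)."""
    bias, units = network
    return bias + sum(alpha * sigmoid(w * x + b) for alpha, w, b in units)


def sup_error(f, network, a, b, samples=2001):
    """Grid estimate of the uniform error (not a proof of a true supremum)."""
    xs = [a + (b - a) * j / (samples - 1) for j in range(samples)]
    return max(abs(f(x) - evaluate(network, x)) for x in xs)


if __name__ == "__main__":
    for n in (10, 50, 200):
        net = build_shallow_network(math.sin, 0.0, 2 * math.pi, n)
        print(n, sup_error(math.sin, net, 0.0, 2 * math.pi))
```

In this construction the estimated error shrinks roughly in proportion to 1/n for a smooth target such as sin, which matches the qualitative claim that required width depends on the target accuracy and on how fast the function varies.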

Practical Implications

Verified AI components – Developers can now import a certified sigmoid library into safety‑critical systems (e.g., autonomous driving, medical diagnostics) and obtain machine‑checked guarantees about activation behaviour.
Network sizing guidelines – The constructive UAT provides a principled way to estimate hidden‑layer width for a desired approximation accuracy, useful for resource‑constrained edge deployments.
Foundation for further verification – With the sigmoid formalised, extending verification to full training pipelines (gradient descent correctness, loss‑function properties) becomes more tractable.
Cross‑tool integration – The Isabelle theories can be exported to code generators (e.g., Isabelle‑LLVM, Isabelle‑Haskell) to produce formally verified inference code, reducing the risk of implementation bugs.
Educational value – The proof scripts serve as concrete examples for teaching formal methods to ML engineers, bridging the gap between theoretical guarantees and practical code.

Limitations & Future Work

Scope limited to single‑hidden‑layer networks and compact intervals; deeper architectures and high‑dimensional input spaces are not yet covered.
Sigmoid‑only focus – While the logistic function is popular, many modern models use ReLU, GELU, or attention mechanisms; extending the library to these activations remains an open task.
Performance of generated code – The verified implementations prioritize correctness over runtime efficiency; optimizing the extracted code for production workloads is future work.
Integration with training dynamics – The current work verifies approximation capability but does not address convergence of learning algorithms; coupling the UAT proof with verified gradient descent is a natural next step.

Authors

Dustin Bryant
Jim Woodcock
Simon Foster

Paper Information

arXiv ID: 2512.03635v1
Categories: cs.LO, cs.SE
Published: December 3, 2025
