[Paper] Free-RBF-KAN: Kolmogorov-Arnold Networks with Adaptive Radial Basis Functions for Efficient Function Learning
Source: arXiv - 2601.07760v1
Overview
The paper introduces Free‑RBF‑KAN, a new variant of Kolmogorov‑Arnold Networks (KANs) that swaps the traditional B‑spline basis for adaptive radial basis functions (RBFs). By letting the RBF centers, widths, and smoothness parameters be learned directly from data, the authors achieve the same approximation power as classic KANs while cutting training and inference time—an attractive proposition for anyone building high‑performance, low‑latency ML models.
Key Contributions
- Adaptive RBF Grid: Unlike fixed RBF placements, the network learns a “free” grid of RBF centers and scales, aligning the basis with the data’s activation patterns.
- Trainable Smoothness Parameter: Smoothness is treated as a kernel hyper‑parameter and optimized jointly with weights, removing the need for manual tuning.
- Universality Proof for RBF‑KANs: The authors extend the theoretical foundation of KANs, showing that any continuous multivariate function can be approximated arbitrarily well with the proposed RBF formulation.
- Efficiency Gains: Empirical benchmarks demonstrate faster forward/backward passes compared with B‑spline‑based KANs, without extra memory overhead.
- Broad Experimental Validation: Experiments span multiscale function fitting, physics‑informed neural networks (PINNs), and learning solution operators for PDEs, confirming both accuracy and speed benefits.
Methodology
- Network Architecture – A KAN decomposes a multivariate function into a sum of univariate “inner” functions followed by a multivariate “outer” function. Free‑RBF‑KAN replaces each inner univariate function with a weighted sum of Gaussian RBFs (a minimal code sketch follows this list):

  $$f_i(x) = \sum_{k=1}^{K} w_{ik}\,\phi\bigl(\alpha_{ik}(x - c_{ik})\bigr)$$

  where $c_{ik}$ (center), $\alpha_{ik}$ (inverse width), and a global smoothness scalar $\beta$ are all learnable.
- Adaptive Grid Learning – During back‑propagation, gradients flow not only to the linear weights $w_{ik}$ but also to the centers $c_{ik}$ and scales $\alpha_{ik}$. This lets the basis “morph” to match the data distribution, effectively providing a data‑driven resolution grid.
- Smoothness as a Kernel Parameter – The Gaussian kernel is modified to $\phi_{\beta}(z)=\exp(-\beta z^{2})$. The scalar $\beta$ is optimized jointly with the other parameters, allowing the network to trade off smoothness against sharpness automatically.
- Training Pipeline – The authors use standard stochastic gradient descent (Adam) with weight decay. No special regularizers are required; the adaptive parameters are naturally constrained by the loss gradient.
- Theoretical Guarantee – By constructing a dense set of RBFs and leveraging the Kolmogorov‑Arnold representation theorem, they prove that Free‑RBF‑KAN can approximate any continuous function on a compact domain to arbitrary precision.
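Below is a minimal PyTorch sketch of a layer built around this idea; the class and parameter names (`FreeRBFKANLayer`, `num_rbf`, the initialization choices) are our own assumptions, not taken from the authors' released code. Each input feature keeps its own learnable centers and inverse widths, a single smoothness scalar $\beta$ is shared, and all of them receive gradients alongside the mixing weights.

```python
# Minimal sketch (not the authors' released implementation) of a KAN layer whose
# inner univariate functions are weighted sums of Gaussian RBFs with learnable
# centers c_{ik}, inverse widths alpha_{ik}, and a shared smoothness scalar beta.
import torch
import torch.nn as nn


class FreeRBFKANLayer(nn.Module):
    """Maps (batch, in_dim) -> (batch, out_dim) through adaptive RBF expansions."""

    def __init__(self, in_dim: int, out_dim: int, num_rbf: int = 8):
        super().__init__()
        # One adaptive grid of centers/scales per input feature, initialized uniformly on [-1, 1].
        self.centers = nn.Parameter(torch.linspace(-1.0, 1.0, num_rbf).repeat(in_dim, 1))  # (in_dim, K)
        self.log_alpha = nn.Parameter(torch.zeros(in_dim, num_rbf))   # inverse widths, kept positive via exp
        self.log_beta = nn.Parameter(torch.zeros(()))                 # global smoothness scalar beta
        self.weights = nn.Parameter(0.1 * torch.randn(out_dim, in_dim, num_rbf))  # mixing weights w_{ik}

    def forward(self, x: torch.Tensor) -> torch.Tensor:
        # z = alpha_{ik} * (x_i - c_{ik}),  phi_beta(z) = exp(-beta * z^2)
        z = torch.exp(self.log_alpha) * (x.unsqueeze(-1) - self.centers)   # (batch, in_dim, K)
        phi = torch.exp(-torch.exp(self.log_beta) * z ** 2)                # (batch, in_dim, K)
        # f_i(x) = sum_k w_{ik} phi(...), summed over inputs for each output unit.
        return torch.einsum("bik,oik->bo", phi, self.weights)
```

Because the centers, widths, and $\beta$ are ordinary `nn.Parameter`s, the training pipeline described above (Adam with weight decay) needs no special handling: back‑propagation adapts the grid and smoothness together with the weights.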
Results & Findings
| Task | Metric (lower is better) | B‑spline KAN | Free‑RBF‑KAN | Speedup (train / infer) |
|---|---|---|---|---|
| Multiscale 1‑D function | MSE | 1.2e‑4 | 1.1e‑4 | 1.8× / 2.1× |
| PINN for Burgers’ equation | Relative L2 error | 3.5e‑3 | 3.3e‑3 | 1.6× / 1.9× |
| PDE operator (Navier‑Stokes) | MAE | 4.8e‑3 | 4.7e‑3 | 1.5× / 1.7× |
- Accuracy: Free‑RBF‑KAN matches or slightly improves on the B‑spline KAN across all benchmarks, confirming that adaptive RBFs close the performance gap observed in earlier RBF‑KAN attempts.
- Efficiency: By eliminating the costly De Boor recursion required for B‑splines, the new model reduces both FLOPs and memory traffic, yielding roughly 1.5–2× faster training and inference.
- Scalability: Experiments up to 64‑dimensional input spaces show stable convergence, indicating that the adaptive grid does not explode combinatorially.
Practical Implications
- Faster Prototyping – Developers can swap a B‑spline KAN for Free‑RBF‑KAN with a drop‑in code change (see the sketch after this list) and immediately see speed gains, especially valuable in edge‑device or real‑time inference scenarios.
- Adaptive Resolution for Scientific ML – In physics‑informed models where solution features (e.g., shocks) are localized, the learnable RBF grid automatically concentrates basis functions where they’re needed, reducing the manual engineering of mesh refinements.
- Low‑Memory Deployments – Since RBFs are parameter‑efficient (no knot vectors), model size stays comparable to classic KANs, making the approach suitable for mobile or embedded AI stacks.
- Plug‑and‑Play with Existing Frameworks – The authors provide a PyTorch implementation that integrates with standard `nn.Module` pipelines, meaning existing training loops, optimizers, and mixed‑precision utilities work out‑of‑the‑box.
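As a hypothetical illustration of that drop‑in workflow, the snippet below reuses the `FreeRBFKANLayer` sketched in the Methodology section inside a standard PyTorch training loop; the two‑layer architecture, hyper‑parameters, and toy multiscale target are our own illustrative choices, not settings from the paper.

```python
# Hypothetical drop-in usage of the FreeRBFKANLayer sketched earlier inside a
# standard PyTorch training loop; architecture and target are illustrative only.
import torch
import torch.nn as nn
import torch.nn.functional as F

model = nn.Sequential(
    FreeRBFKANLayer(in_dim=1, out_dim=16, num_rbf=8),
    FreeRBFKANLayer(in_dim=16, out_dim=1, num_rbf=8),
)
optimizer = torch.optim.Adam(model.parameters(), lr=1e-3, weight_decay=1e-5)

x = torch.linspace(-1.0, 1.0, 512).unsqueeze(-1)        # (512, 1) inputs
y = torch.sin(20.0 * x) + 0.5 * torch.sin(3.0 * x)      # toy multiscale 1-D target

for step in range(2000):
    optimizer.zero_grad()
    loss = F.mse_loss(model(x), y)
    loss.backward()            # gradients also reach centers, widths, and beta
    optimizer.step()
```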
Limitations & Future Work
- Hyper‑parameter Sensitivity – While smoothness is learned, the initial number of RBFs per inner function still needs to be chosen; too few can limit expressivity, too many can increase training time.
- Gradient Stability – Learning centers and widths jointly can lead to occasional “collapse,” where multiple RBFs converge to the same location; the authors mitigate this with small learning‑rate schedules, but a more robust regularizer could help (one possible form is sketched after this list).
- Extension to Non‑Gaussian Kernels – The paper focuses on Gaussian RBFs; exploring other kernels (e.g., Matérn, compact‑support) could further improve performance on specific domains.
- Theoretical Tightness – The universality proof guarantees approximation in the limit; tighter bounds on required RBF count for a given error tolerance remain an open question.
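For illustration only, one form such a regularizer could take (our assumption, not something proposed in the paper) is a pairwise repulsion penalty that discourages the centers of the same inner function from collapsing onto each other:

```python
# Hypothetical illustration (not from the paper): a pairwise repulsion penalty
# that discourages RBF centers within one inner function from collapsing together.
import torch


def center_repulsion(centers: torch.Tensor, scale: float = 0.1) -> torch.Tensor:
    """centers: (in_dim, K) learnable RBF centers; returns a scalar penalty."""
    diff = centers.unsqueeze(-1) - centers.unsqueeze(-2)      # (in_dim, K, K) pairwise differences
    repulsion = torch.exp(-diff ** 2 / scale)                 # large when two centers nearly coincide
    off_diag = 1.0 - torch.eye(centers.shape[-1], device=centers.device)
    return (repulsion * off_diag).sum() / off_diag.sum() / centers.shape[0]
```

It could be added to the data‑fitting loss, e.g. `loss = mse + lam * center_repulsion(layer.centers)`, with the coefficient `lam` treated as one more hyper‑parameter to tune.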
Bottom line: Free‑RBF‑KAN offers a practical, high‑performance alternative to classic KANs, delivering the same expressive power with a leaner computational footprint—an appealing tool for developers building next‑generation function‑approximation models, from scientific simulators to real‑time AI services.
Authors
- Shao‑Ting Chiu
- Siu Wun Cheung
- Ulisses Braga‑Neto
- Chak Shing Lee
- Rui Peng Li
Paper Information
- arXiv ID: 2601.07760v1
- Categories: cs.LG, math.NA
- Published: January 12, 2026