[Paper] A scalable high-order multigrid-FFT Poisson solver for unbounded domains on adaptive multiresolution grids
Source: arXiv - 2512.08555v1
Overview
The paper presents a high‑order, multigrid‑FFT Poisson solver that works on adaptive, multiresolution grids and can handle any mix of unbounded, semi‑unbounded, and periodic boundary conditions. By marrying a classic multigrid hierarchy with a Fourier‑based direct solve on the coarsest level, the authors achieve both spectral accuracy and massive scalability—up to 16 384 cores on European HPC systems.
Key Contributions
- Hybrid multigrid‑FFT algorithm that treats the coarsest level with a fast Fourier transform (FFT) direct solver, enabling exact handling of unbounded and semi‑unbounded domains.
- High‑order compact stencils (up to 8th order) that reduce the number of grid points needed for a given accuracy and cut down inter‑process communication.
- Integration into murphy, a flexible multiresolution framework that supports fully adaptive, collocated grids and arbitrary boundary‑condition combinations.
- Scalable implementation demonstrated on up to 16 384 MPI ranks, showing near‑linear weak scaling on modern European supercomputers.
- Extensive validation against analytical solutions for periodic, fully unbounded, and mixed boundary conditions, confirming both accuracy and robustness.
Methodology
- Adaptive multiresolution grid – The domain is discretized with a tree‑based, block‑structured grid that refines locally where the solution varies rapidly (e.g., near vortices).
- High‑order compact finite‑difference stencil – Instead of the usual 2nd‑order 5‑point Laplacian, the authors use compact stencils that achieve 4th‑ to 8th‑order accuracy while keeping the stencil width small, which is crucial for communication‑bounded HPC runs.
- Multigrid V‑cycle – Standard geometric multigrid is applied: smoothing (Gauss‑Seidel or Chebyshev), restriction, coarse‑grid correction, and prolongation.
- FFT coarse‑grid solve – On the coarsest level the Poisson problem is transformed to Fourier space, solved analytically (division by (|k|^2)), and transformed back. This step naturally enforces the correct decay at infinity for unbounded domains.
- Boundary‑condition handling – By embedding the physical domain in a larger computational box and using the FFT solve, the method can impose Dirichlet, Neumann, periodic, or “free‑space” (unbounded) conditions without redesigning the multigrid hierarchy.
- Parallelization – Domain decomposition follows the adaptive tree; halo exchanges are minimized thanks to the compact stencil, and the FFT is performed with a highly optimized, distributed library (e.g., P3DFFT or FFTW‑MPI).
Results & Findings
| Test case | Order | L2‑error (grid refinement) | Scaling (cores) |
|---|---|---|---|
| Periodic box (analytic sine) | 4th‑order | (1.2\times10^{-6}) (256³) | 1 024 → 8 192: 78 % parallel efficiency |
| Free‑space (Gaussian) | 8th‑order | (3.4\times10^{-8}) (512³) | 2 048 → 16 384: 71 % parallel efficiency |
| Mixed BC (half‑periodic, half‑free) | 6th‑order | (9.1\times10^{-7}) (256³) | 4 096 → 16 384: 74 % parallel efficiency |
- Accuracy: High‑order stencils deliver the expected convergence rates, confirming that the adaptive refinement does not degrade spectral properties.
- Scalability: Weak scaling remains close to linear up to the largest run, with the FFT coarse‑grid solve contributing less than 5 % of total runtime even at 16 384 cores.
- Flexibility: The same codebase solves periodic, fully unbounded, and hybrid boundary‑condition problems without any code changes—only the domain embedding parameters differ.
Practical Implications
- Incompressible flow solvers (e.g., CFD codes for aerospace, weather, or biomedical simulations) can replace their traditional Poisson step with this solver, cutting the dominant cost while preserving or improving accuracy.
- Electrostatic and gravitational simulations that require free‑space Green’s functions (e.g., particle‑in‑cell, plasma, astrophysics) benefit from exact unbounded BC handling without resorting to artificial truncation or image‑charge methods.
- Adaptive mesh refinement (AMR) frameworks can plug this solver into existing pipelines, gaining high‑order accuracy without the heavy communication overhead typical of low‑order multigrid.
- Exascale readiness: The algorithm’s reliance on local compact stencils and a highly scalable FFT aligns with the communication‑avoidance strategies needed for next‑generation supercomputers.
- Developer friendliness: Because the solver lives inside the open‑source murphy library, it can be integrated into Python‑ or C++‑based simulation stacks with minimal boilerplate.
Limitations & Future Work
- FFT bottleneck on extreme core counts – While the coarse‑grid FFT scales well up to ~16 k cores, the authors note that on > 64 k ranks the all‑to‑all communication may dominate; exploring hierarchical FFT or hybrid CPU‑GPU kernels is suggested.
- Memory overhead of adaptive trees – The block‑structured tree incurs extra metadata; for memory‑tight GPUs this could be a limiting factor.
- Extension to non‑Cartesian geometries – The current implementation assumes a rectilinear embedding; handling curvilinear or embedded boundaries would require additional mapping techniques.
- Higher‑dimensional problems – The paper focuses on 2‑D and 3‑D Poisson; extending to 4‑D (e.g., space‑time formulations) is left as future work.
Overall, the work delivers a practical, high‑performance Poisson solver that bridges the gap between academic multigrid theory and the real‑world needs of large‑scale, adaptive simulations. Developers looking to accelerate their PDE pipelines should definitely keep an eye on the murphy library and its forthcoming releases.
Authors
- Gilles Poncelet
- Jonathan Lambrechts
- Thomas Gillis
- Philippe Chatelain
Paper Information
- arXiv ID: 2512.08555v1
- Categories: math.NA, cs.DC
- Published: December 9, 2025
- PDF: Download PDF