[Paper] Chebyshev Accelerated Subspace Eigensolver for Pseudo-hermitian Hamiltonians
Source: arXiv - 2601.10557v1
Overview
The paper extends ChASE (Chebyshev Accelerated Subspace iteration Eigensolver) so that it can efficiently compute thousands of low‑lying eigenpairs of pseudo‑Hermitian Hamiltonians, the class of matrices that arises when modelling excitonic effects in optoelectronic materials. By adapting a proven Hermitian eigensolver to this broader class, the authors deliver a tool that scales on modern GPU‑accelerated clusters while retaining the convergence speed and memory footprint of the Hermitian solver.
Key Contributions
- Pseudo‑Hermitian extension of ChASE: a drop‑in replacement for the original Hermitian solver that works on matrices satisfying (H^\dagger = \eta H \eta^{-1}).
- Oblique Rayleigh‑Ritz projection: a novel variant that attains quadratic convergence of Ritz values without explicitly forming the dual basis, exploiting the underlying metric (\eta).
- Communication‑reduced Chebyshev filter: a parallel implementation of the recursive matrix‑product that limits global synchronisations, crucial for exascale scalability.
- Comprehensive numerical analysis: proofs of convergence, stability bounds, and complexity estimates that match the Hermitian case.
- Extensive experimental validation: benchmarks on dense pseudo‑Hermitian Hamiltonians from excitonic calculations, showing comparable runtime and iteration counts to the Hermitian baseline.
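The defining relation (H^\dagger = \eta H \eta^{-1}) from the first bullet can be checked numerically. The NumPy sketch below is illustrative only and is not part of ChASE: the helper names `random_pseudo_hermitian` and `is_pseudo_hermitian` are hypothetical, and building (H = \eta^{-1} A) from a Hermitian (A) and a Hermitian invertible (\eta) is just one convenient way to obtain a test matrix.

```python
import numpy as np


def random_pseudo_hermitian(n, seed=0):
    """Return an illustrative pair (H, eta) with H^dagger = eta H eta^{-1}.

    Assumption for this sketch: eta is a diagonal +/-1 metric, which is
    Hermitian, invertible, and indefinite.
    """
    rng = np.random.default_rng(seed)
    signs = np.concatenate([np.ones(n // 2), -np.ones(n - n // 2)])
    eta = np.diag(signs).astype(complex)
    # Any Hermitian A yields a pseudo-Hermitian H = eta^{-1} A.
    B = rng.standard_normal((n, n)) + 1j * rng.standard_normal((n, n))
    A = (B + B.conj().T) / 2
    H = np.linalg.solve(eta, A)
    return H, eta


def is_pseudo_hermitian(H, eta, tol=1e-12):
    """Test H^dagger eta == eta H, which is equivalent to H^dagger = eta H eta^{-1}."""
    lhs = H.conj().T @ eta
    rhs = eta @ H
    return bool(np.linalg.norm(lhs - rhs) <= tol * max(1.0, np.linalg.norm(rhs)))


H, eta = random_pseudo_hermitian(8)
print(is_pseudo_hermitian(H, eta))                    # relation holds by construction
print(is_pseudo_hermitian(H + 1j * np.eye(8), eta))   # an imaginary shift breaks it
```

Testing the relation in the form (H^\dagger \eta = \eta H) avoids inverting the metric, which is a reasonable choice when (\eta) may be poorly conditioned.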
Methodology
- Problem Formulation – The target eigenproblem is (H x = \lambda \eta x) where (H) is dense, complex, and pseudo‑Hermitian with respect to a known metric matrix (\eta). The goal is to obtain the smallest (k) eigenpairs ((k) can be a few thousand).
- Chebyshev Filtering – ChASE builds a subspace by repeatedly applying a Chebyshev polynomial filter (p_m(H)) to a set of trial vectors. The polynomial is tuned to amplify components belonging to the desired spectral region while damping the rest.
- Oblique Rayleigh‑Ritz – After filtering, the algorithm projects the problem onto the current subspace using the oblique inner product defined by (\eta). This yields a small dense generalized eigenproblem whose solutions (Ritz pairs) converge quadratically to the true eigenpairs, even though the dual basis (\eta^{-1}X) is never formed explicitly.
- Parallel Implementation – The recursive Chebyshev recurrence (Y_{j+1}=2H Y_j - Y_{j-1}) is executed with a blocked matrix‑vector product that overlaps computation and communication. Only a single global reduction per Chebyshev degree is required, dramatically reducing latency on large clusters.
- Stopping Criteria & Deflation – Residual norms are monitored in the (\eta)-inner product; converged vectors are locked (deflated) to avoid unnecessary work, a standard technique in subspace iteration.
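The Chebyshev filtering step can be sketched with the three‑term recurrence quoted above. This is a minimal serial NumPy illustration, not the paper's parallel implementation. The function name `chebyshev_filter` and the parameters `a` and `b` (bounds of the interval to be damped) are assumptions of this sketch; the recurrence (Y_{j+1}=2H Y_j - Y_{j-1}) is applied to (H) after an explicit affine map that sends ([a, b]) to ([-1, 1]), and the demo matrix is diagonal with a real spectrum for simplicity.

```python
import numpy as np


def chebyshev_filter(H, X, degree, a, b):
    """Apply the degree-`degree` Chebyshev polynomial of the shifted, scaled H to X.

    Eigencomponents with eigenvalues inside [a, b] stay bounded by 1 in
    magnitude; components outside the interval grow rapidly with the degree.
    Requires degree >= 1.
    """
    c = 0.5 * (a + b)   # centre of the damped interval
    e = 0.5 * (b - a)   # half-width of the damped interval
    Y_prev = X
    Y = (H @ X - c * X) / e                          # T_1 term
    for _ in range(2, degree + 1):
        Y_next = 2.0 * (H @ Y - c * Y) / e - Y_prev  # three-term recurrence
        Y_prev, Y = Y, Y_next
    return Y


# Toy example: one wanted eigenvalue (0.0) below the damped interval [1, 3].
H = np.diag([0.0, 1.5, 2.0, 3.0])
X = np.ones((4, 1))
filtered = chebyshev_filter(H, X, degree=10, a=1.0, b=3.0)
print(filtered.ravel())  # first entry is T_10(-2) = 262087; the others have magnitude <= 1
```

After ten steps the component along the wanted eigenvector dominates the others by more than five orders of magnitude, which is why a modest polynomial degree is often enough to separate the target part of the spectrum.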
Overall, the workflow mirrors the familiar Hermitian ChASE pipeline, making the extension straightforward for developers already using the library.
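To make the oblique Rayleigh‑Ritz step concrete, here is a textbook‑style NumPy sketch of a projection in the bilinear form induced by (\eta). It is one possible reading of the description above and should not be taken as the paper's algorithm. The function name `oblique_rayleigh_ritz`, the test matrices, and the choice to solve the small problem with a general (non‑Hermitian) eigensolver are all assumptions of this sketch.

```python
import numpy as np


def oblique_rayleigh_ritz(H, eta, V):
    """Project onto span(V) using the eta form and return sorted Ritz pairs.

    Solves the small generalized problem (V^H eta H V) y = theta (V^H eta V) y.
    The dual basis eta^{-1} V is never formed.
    """
    G = V.conj().T @ (eta @ V)         # k x k Gram matrix in the eta form
    M = V.conj().T @ (eta @ (H @ V))   # k x k projected operator
    theta, Y = np.linalg.eig(np.linalg.solve(G, M))
    order = np.argsort(theta.real)
    return theta[order], V @ Y[:, order]


# Test problem with known real spectrum 1..6 and an indefinite metric.
rng = np.random.default_rng(1)
n, k = 6, 2
D = np.diag(np.arange(1.0, n + 1))
eta0 = np.diag([1.0, -1.0, 1.0, -1.0, 1.0, -1.0])
W = np.eye(n) + 0.1 * (rng.standard_normal((n, n)) + 1j * rng.standard_normal((n, n)))
eta = W.conj().T @ eta0 @ W            # Hermitian, indefinite metric
H = np.linalg.solve(W, D @ W)          # H = W^{-1} D W, so eta H is Hermitian
X = np.linalg.inv(W)                   # columns are eigenvectors of H

# Subspace spanned by the two lowest eigenvectors, deliberately mixed.
C = np.array([[1.0, 2.0], [0.5, -1.0]])
V = X[:, :k] @ C

theta, ritz_vectors = oblique_rayleigh_ritz(H, eta, V)
residual = np.linalg.norm(H @ ritz_vectors - ritz_vectors * theta)
print(theta.real, residual)            # Ritz values close to 1 and 2, tiny residual
```

Because the subspace here is exactly invariant, the Ritz values reproduce the two lowest eigenvalues up to rounding. In a real iteration the subspace is only approximately invariant and the Ritz pairs improve from one filtering pass to the next.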
Results & Findings
| Test case | Matrix size | Desired eigenpairs | Avg. iterations | Runtime relative to Hermitian ChASE | Accuracy (relative residual) |
|---|---|---|---|---|---|
| 2‑D excitonic Hamiltonian (real‑space) | 12 k × 12 k | 2 k | 18 | 1.02× (≈ identical) | < 1e‑10 |
| 3‑D bulk perovskite (complex) | 24 k × 24 k | 4 k | 21 | 0.96× (slightly faster) | < 5e‑11 |
| Random pseudo‑Hermitian (controlled spectrum) | 8 k × 8 k | 1 k | 15 | 1.00× | < 1e‑12 |
Key take‑aways
- Convergence: The oblique Rayleigh‑Ritz step yields quadratic convergence of Ritz values, matching the Hermitian case despite the extra metric.
- Performance: The communication‑reduced Chebyshev filter removes the dominant bottleneck on 64‑node GPU clusters, delivering up to a 5 % runtime reduction on the largest test.
- Scalability: Strong‑scaling experiments show > 80 % parallel efficiency up to 256 GPUs, confirming that the algorithm remains compute‑bound rather than communication‑bound.
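For reference, strong‑scaling parallel efficiency is conventionally defined relative to a baseline run. The symbols below are assumptions of this note: (T(p)) is the wall‑clock time on (p) GPUs and (p_0) is the baseline GPU count, which this summary does not state.

```latex
E(p) = \frac{p_0 \, T(p_0)}{p \, T(p)}
```

Under this definition, an efficiency above 80 % at 256 GPUs means (T(256) < p_0 \, T(p_0) / (0.8 \cdot 256)).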
Practical Implications
- Materials‑by‑design pipelines – Researchers building high‑throughput workflows for excitonic or GW‑BSE calculations can now embed ChASE‑PH directly, extracting thousands of low‑energy states without resorting to dense diagonalisation.
- Exascale readiness – The reduced‑communication filter suits GPU‑accelerated supercomputers built on architectures such as NVIDIA Hopper and AMD Instinct, so existing ChASE‑based codes can be expected to scale to larger machines with minimal changes.
- Software integration – Because the API mirrors the original ChASE library (C/C++/Fortran bindings, Python wrapper), developers can swap the Hermitian solver for the pseudo‑Hermitian version with a single header change.
- GPU acceleration – The implementation leverages cuBLAS level‑3 (GEMM) kernels; developers can further optimise by fusing the Chebyshev recurrence into custom kernels for mixed‑precision or tensor‑core execution.
In short, the work removes a long‑standing barrier: efficiently handling the metric (\eta) in large‑scale eigenvalue problems, opening the door for real‑time band‑structure and exciton‑binding‑energy calculations in production environments.
Limitations & Future Work
- Dense matrix assumption – The current implementation assumes the Hamiltonian is stored densely. Extending the approach to sparse or block‑structured pseudo‑Hermitian matrices (common in plane‑wave codes) is left for future research.
- Metric conditioning – Very ill‑conditioned (\eta) can degrade the numerical stability of the oblique projection; the authors suggest preconditioning strategies but do not explore them experimentally.
- Higher‑order excitations – The paper focuses on the lowest part of the spectrum. Adapting the filter to target interior eigenvalues (e.g., mid‑gap states) would require additional spectral transformation techniques.
- Mixed‑precision – Preliminary tests hint at potential speed‑ups using half‑precision Chebyshev filters, but a rigorous error analysis is pending.
Overall, the authors provide a solid foundation for pseudo‑Hermitian eigenvalue solving on exascale hardware, while highlighting clear avenues for extending the methodology to broader problem classes and tighter performance envelopes.
Authors
- Edoardo Di Napoli
- Clément Richefort
- Xinzhe Wu
Paper Information
- arXiv ID: 2601.10557v1
- Categories: math.NA, cs.CE, cs.DC, physics.comp-ph
- Published: January 15, 2026