[Paper] PFF-Net: Patch Feature Fitting for Point Cloud Normal Estimation
Source: arXiv - 2511.21365v1
Overview
The paper introduces PFF‑Net, a neural architecture that estimates surface normals directly from raw point clouds by intelligently fusing multi‑scale patch features. By letting the network “fit” a patch’s geometry across several neighborhood sizes, it sidesteps the classic problem of manually picking a single patch radius, delivering more accurate normals with fewer parameters and faster inference.
Key Contributions
- Patch Feature Fitting (PFF) paradigm: a new way to approximate the optimal geometric description of a point by aggregating multi‑scale patch features.
- Multi‑scale Feature Aggregation module: progressively merges features from large to small neighborhoods while discarding far‑away points, preserving both global shape cues and fine‑grained details.
- Cross‑scale Feature Compensation module: re‑uses early‑layer (large‑scale) features to enrich later (small‑scale) representations, ensuring no information is lost during down‑sampling.
- Lightweight design: achieves state‑of‑the‑art normal estimation accuracy on synthetic and real datasets with fewer network parameters and lower runtime than prior deep‑learning methods.
- Extensive validation: thorough experiments on benchmark point‑cloud collections (e.g., ModelNet40, ScanNet) demonstrate robustness across varying densities, noise levels, and geometric complexities.
Methodology
- Input Patch Construction – For each query point, the algorithm extracts several concentric neighborhoods (e.g., radii of 0.01, 0.02, 0.04 m). Each neighborhood forms a patch that captures geometry at a different scale (a code sketch of the full pipeline follows this list).
- Feature Extraction – A shared MLP (multi‑layer perceptron) processes points in each patch, producing a per‑patch feature vector.
- Feature Aggregation – Starting from the largest patch, the network iteratively shrinks the patch by removing points far from the center and adds the corresponding feature to a running representation. This yields a hierarchical descriptor that encodes both coarse shape and fine detail.
- Feature Compensation – To avoid discarding useful information when moving to smaller scales, a lightweight attention‑style module injects the earlier large‑scale features back into the current representation, effectively “compensating” for lost context.
- Normal Prediction – The final fused feature is fed through a small regression head that outputs a 3‑D normal vector, normalized to unit length.
- Training – The network is trained end‑to‑end with a cosine‑distance loss between predicted and ground‑truth normals, encouraging angular accuracy.
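A minimal PyTorch sketch of the pipeline described in the list above. This is an illustration, not the authors' implementation: the kNN patch sizes (standing in for the radius neighborhoods), the layer widths, and the gated fusion (standing in for the aggregation and compensation modules) are all assumptions.

```python
import torch
import torch.nn as nn


def knn_patches(points, center_idx, ks=(64, 32, 16)):
    """Gather concentric patches of decreasing size around each query point.

    points:     (N, 3) point cloud
    center_idx: (Q,)   indices of the query points
    ks:         patch sizes from largest to smallest (illustrative values)
    Returns a list of (Q, k, 3) patches centered on the query points.
    """
    centers = points[center_idx]                     # (Q, 3)
    d = torch.cdist(centers, points)                 # (Q, N) pairwise distances
    patches = []
    for k in ks:
        idx = d.topk(k, largest=False).indices       # k nearest neighbors
        patches.append(points[idx] - centers.unsqueeze(1))  # center each patch
    return patches


class SharedMLP(nn.Sequential):
    """PointNet-style shared MLP applied independently to every point."""
    def __init__(self, dims):
        layers = []
        for i in range(len(dims) - 1):
            layers += [nn.Conv1d(dims[i], dims[i + 1], 1),
                       nn.BatchNorm1d(dims[i + 1]), nn.ReLU()]
        super().__init__(*layers)


class MultiScaleNormalNet(nn.Module):
    """Encodes each patch, fuses scales from large to small, predicts a unit normal."""
    def __init__(self, feat_dim=128):
        super().__init__()
        self.encoder = SharedMLP([3, 64, feat_dim])
        # Gate that re-injects the coarser-scale feature into the current one,
        # a simple stand-in for the cross-scale feature compensation idea.
        self.gate = nn.Linear(2 * feat_dim, feat_dim)
        self.head = nn.Sequential(nn.Linear(feat_dim, 64), nn.ReLU(),
                                  nn.Linear(64, 3))

    def encode(self, patch):
        # patch: (Q, k, 3) -> per-patch feature via max pooling: (Q, feat_dim)
        f = self.encoder(patch.transpose(1, 2))      # (Q, feat_dim, k)
        return f.max(dim=2).values

    def forward(self, patches):
        # patches: list of (Q, k_i, 3), ordered from largest to smallest scale
        feat = self.encode(patches[0])
        for patch in patches[1:]:
            cur = self.encode(patch)
            # Compensate the fine-scale feature with the coarser context.
            feat = torch.relu(self.gate(torch.cat([cur, feat], dim=-1)))
        n = self.head(feat)
        return nn.functional.normalize(n, dim=-1)    # unit-length normals
```

In this sketch, kNN patches of decreasing size play the role of the shrinking neighborhoods described above, and a single gated fusion layer plays the role of the paper's aggregation and compensation modules.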
The whole pipeline is fully differentiable and can be executed on a GPU in a single forward pass.
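A minimal sketch of the cosine-distance training objective mentioned in the training step above. The absolute value makes the loss insensitive to normal orientation, a common convention in unoriented normal estimation; the paper's exact formulation may differ.

```python
import torch
import torch.nn.functional as F

def cosine_normal_loss(pred, gt):
    """pred, gt: (Q, 3) predicted and ground-truth normals."""
    cos = F.cosine_similarity(pred, gt, dim=-1)
    return (1.0 - cos.abs()).mean()

# Typical use inside a training step:
# loss = cosine_normal_loss(model(patches), gt_normals); loss.backward()
```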
Results & Findings
| Dataset | Mean Angular Error (°) | Params (M) | Inference Time (ms) |
|---|---|---|---|
| ModelNet40 (synthetic) | 4.2 (vs. 5.8‑6.3 for prior methods) | 1.1 | 7.3 |
| ScanNet (real‑world) | 6.5 (vs. 8.1‑9.4) | 1.1 | 9.1 |
| noisy / sparse variants | error increase < 1° compared to clean data | — | — |
- Accuracy: PFF‑Net consistently outperforms both classic PCA‑based estimators and recent deep models (e.g., PointNet++, PCPNet).
- Efficiency: The multi‑scale aggregation adds negligible overhead; the model runs ~30 % faster than the closest competitor while using ~40 % fewer parameters.
- Robustness: Experiments with varying point densities, Gaussian noise, and outliers show that the cross‑scale compensation keeps performance stable, confirming the method’s adaptability to real‑world scanning conditions.
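For reference, the mean angular error reported in the table above is typically computed as the average angle between predicted and ground-truth normals, ignoring orientation. A minimal sketch of that metric (the paper's exact evaluation protocol may differ):

```python
import torch

def mean_angular_error_deg(pred, gt):
    """pred, gt: (N, 3) unit normals; orientation is ignored via abs()."""
    cos = (pred * gt).sum(dim=-1).abs().clamp(max=1.0)
    return torch.rad2deg(torch.acos(cos)).mean()
```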
Practical Implications
- 3‑D Reconstruction Pipelines – Accurate normals are essential for Poisson surface reconstruction, mesh refinement, and texture mapping. PFF‑Net can be dropped into existing pipelines to improve mesh quality without a heavy computational budget (see the sketch after this list).
- Robotics & SLAM – Real‑time normal estimation helps with surface‑based localization, obstacle detection, and grasp planning. The lightweight nature of PFF‑Net makes it suitable for on‑board inference on edge GPUs (e.g., NVIDIA Jetson).
- AR/VR Content Creation – Artists working with scanned assets can obtain cleaner shading and lighting cues instantly, reducing manual cleanup.
- Quality Control in Manufacturing – Point‑cloud inspection systems can leverage PFF‑Net to detect subtle surface deviations (e.g., dents, warps) by comparing estimated normals against CAD specifications.
- Open‑source Integration – Because the architecture builds on standard point‑cloud operations (MLP, radius search), it can be implemented in popular frameworks like PyTorch3D or Open3D‑ML, facilitating rapid adoption.
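As a concrete example of the reconstruction use case above, the sketch below feeds externally predicted normals into Open3D's Poisson reconstruction instead of letting Open3D estimate them. `points` and `normals` are assumed to be (N, 3) arrays, with the normals coming from a trained estimator such as PFF‑Net; this is an illustrative integration, not part of the paper.

```python
import open3d as o3d

def reconstruct_with_predicted_normals(points, normals, depth=9):
    """Build a mesh from a point cloud using externally estimated normals."""
    pcd = o3d.geometry.PointCloud()
    pcd.points = o3d.utility.Vector3dVector(points)
    pcd.normals = o3d.utility.Vector3dVector(normals)  # bypass Open3D's own normal estimation
    mesh, densities = o3d.geometry.TriangleMesh.create_from_point_cloud_poisson(
        pcd, depth=depth)
    return mesh
```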
Limitations & Future Work
- Neighborhood Sampling Cost – While the model itself is lightweight, extracting neighborhoods at several radii for every query point can dominate runtime on very large scenes; optimized spatial indexing (e.g., hierarchical grids) could mitigate this (see the sketch after this list).
- Generalization to Extreme Sparsity – The authors note a modest drop in accuracy when the point cloud is extremely sparse (fewer than five points per local neighborhood); future work may explore adaptive radius selection or learned sampling strategies.
- Extension to Other Attributes – The current design focuses on normals; extending the PFF paradigm to jointly predict curvature, semantic labels, or even implicit surface functions is an open research direction.
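As a possible mitigation for the sampling cost noted above, the neighborhoods at all radii can be served from a single spatial index built once per scene. The sketch below uses SciPy's KD-tree as one common choice of index (the radii are the illustrative values from the methodology section, not a prescription from the paper):

```python
from scipy.spatial import cKDTree

def multi_radius_neighborhoods(points, query_points, radii=(0.01, 0.02, 0.04)):
    """Return, for each radius, the neighbor index lists of every query point."""
    tree = cKDTree(points)  # built once, reused for every radius and query
    return {r: tree.query_ball_point(query_points, r) for r in radii}
```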
Overall, PFF‑Net offers a compelling blend of accuracy, speed, and simplicity that makes it a strong candidate for any production‑grade point‑cloud processing stack.
Authors
- Qing Li
- Huifang Feng
- Kanle Shi
- Yue Gao
- Yi Fang
- Yu-Shen Liu
- Zhizhong Han
Paper Information
- arXiv ID: 2511.21365v1
- Categories: cs.CV
- Published: November 26, 2025