[Paper] DPSR: Differentially Private Sparse Reconstruction via Multi-Stage Denoising for Recommender Systems
Source: arXiv - 2512.18932v1
Overview
A new paper proposes DPSR (Differentially Private Sparse Reconstruction), a three-stage denoising pipeline that lets recommender systems keep user data private while delivering higher-quality recommendations. By exploiting the natural sparsity, low-rank structure, and collaborative patterns of rating matrices, DPSR reduces the usual privacy-utility penalty and, in the reported experiments, even beats a non-private baseline.
Key Contributions
- Three‑stage post‑processing framework that works after differential‑privacy noise is added, preserving DP guarantees via the post‑processing immunity theorem.
- Information‑theoretic noise calibration that injects less noise into high‑information entries (e.g., popular items) while still respecting the global privacy budget.
- Collaborative‑filtering denoiser that uses item‑item similarity graphs to cancel out a large portion of the injected noise.
- Low‑rank matrix‑completion step that recovers latent user/item factors, further cleaning both privacy noise and inherent data noise.
- Empirical gains of 5.5%–9.2% lower RMSE than the best Laplace/Gaussian DP baselines across ε ∈ [0.1, 10], with statistically significant improvements (p < 0.05).
- Surprising regularization effect: at ε = 1.0, DPSR attains a lower RMSE (0.9823) than the non-private model (1.0983), showing that the denoising pipeline also removes natural data noise.
Methodology
- Noise Injection (DP guarantee) – The original rating matrix is perturbed with calibrated Laplace or Gaussian noise according to a chosen privacy budget ε; on its own, this step would degrade recommendation quality. (A minimal end-to-end sketch of the pipeline follows this list.)
- Stage 1 – Information-Theoretic Calibration – Before noise is injected, the algorithm estimates the information content of each rating (e.g., from item popularity or user activity). High-information entries receive a smaller noise scale, while low-information entries absorb more of the perturbation. This adaptive scaling keeps the overall ε budget intact while concentrating accuracy on the entries that carry the most signal.
- Stage 2 – Collaborative‑Filtering Denoising – After noise injection, a similarity matrix between items is built (e.g., cosine similarity on the noisy vectors). For each rating, a weighted average of its neighbors’ noisy values is computed, effectively smoothing out random perturbations while preserving genuine collaborative signals.
- Stage 3 – Low‑Rank Matrix Completion – The partially denoised matrix is fed into a standard low‑rank factorization/completion algorithm (e.g., alternating least squares or nuclear‑norm minimization). Because rating data are inherently low‑rank, this step recovers the latent user and item factors, eliminating residual noise and filling missing entries.
- Post-Processing Immunity – The denoising stages operate only on the already-noised matrix and never touch the raw data, so the pipeline retains the ε-DP guarantee: any function applied to the output of an ε-differentially private mechanism is itself ε-differentially private.
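The paper's exact algorithms are not reproduced in this summary, so the following is a minimal NumPy sketch of the pattern described above, under stated assumptions: a log-popularity proxy stands in for Stage 1's information content, dense cosine similarity with top-k truncation for Stage 2, and truncated SVD for Stage 3's completion step. All function names and the `alpha` blending knob are illustrative, not the authors' method.

```python
import numpy as np

def calibrated_laplace_noise(R, mask, epsilon, sensitivity=1.0):
    """Stage 1 + injection (illustrative heuristic, not the paper's exact
    calibration): items observed more often are treated as higher-information
    and get smaller Laplace scales. Note this treats the observation pattern
    itself as public, as popularity-based calibration implicitly does."""
    info = np.log1p(mask.sum(axis=0) + 1.0)      # per-item information proxy
    weights = info / info.mean()                 # normalized around 1
    scales = (sensitivity / epsilon) / weights   # high info -> less noise
    noise = np.random.laplace(0.0, scales, size=R.shape)
    return np.where(mask, R + noise, 0.0)

def cf_denoise(R_noisy, k=20):
    """Stage 2: re-estimate each rating as a weighted average over the
    top-k most similar items (cosine similarity on the noisy columns)."""
    norms = np.linalg.norm(R_noisy, axis=0, keepdims=True) + 1e-12
    S = (R_noisy / norms).T @ (R_noisy / norms)  # item-item cosine similarity
    np.fill_diagonal(S, 0.0)
    kth = np.sort(S, axis=1)[:, -k][:, None]     # k-th largest sim per item
    S = np.where(S >= kth, S, 0.0)               # keep only top-k neighbors
    W = S / (S.sum(axis=1, keepdims=True) + 1e-12)
    return R_noisy @ W.T                         # smoothed rating estimates

def lowrank_complete(R, rank=10):
    """Stage 3: truncated SVD as a stand-in for the paper's completion
    step (e.g., alternating least squares or nuclear-norm minimization)."""
    U, s, Vt = np.linalg.svd(R, full_matrices=False)
    return (U[:, :rank] * s[:rank]) @ Vt[:rank]

def dpsr_sketch(R, mask, epsilon, k=20, rank=10, alpha=0.5):
    """End-to-end sketch. Stages 2-3 only ever see the noised matrix, so
    the injection step's epsilon guarantee carries through unchanged."""
    R_noisy = calibrated_laplace_noise(R, mask, epsilon)
    R_cf = cf_denoise(R_noisy, k=k)
    # alpha is a hypothetical knob blending smoothed and raw noisy values
    return lowrank_complete(alpha * R_cf + (1 - alpha) * R_noisy, rank=rank)
```

On a toy matrix, `dpsr_sketch(R, R > 0, epsilon=1.0)` returns a dense reconstruction; in practice the neighbor count `k`, the rank, and the blend would be tuned on a validation split.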
Results & Findings
| Privacy Budget (ε) | Baseline (non‑private) RMSE | Laplace/Gaussian DP RMSE | DPSR RMSE | % Improvement vs. DP |
|---|---|---|---|---|
| 0.1 | 1.0983 | 1.2154 | 1.1021 | 9.2% |
| 0.5 | 1.0983 | 1.0457 | 0.9893 | 5.5% |
| 1.0 | 1.0983 | 0.9972 | 0.9823 | 1.5% (beats non-private) |
| 5.0 | 1.0983 | 0.9451 | 0.9104 | 3.7% |
| 10.0 | 1.0983 | 0.9256 | 0.8872 | 4.2% |
- All improvements are reported as statistically significant (p < 0.05, most p < 0.001); an illustrative significance test appears after this list.
- The denoising pipeline acts as a regularizer, removing not only injected privacy noise but also the stochastic noise present in real‑world rating data.
- Experiments on synthetic datasets with known ground truth confirm that DPSR consistently recovers the latent low‑rank structure more accurately than competing DP mechanisms.
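The summary reports p-values but not the test procedure behind them. One common way such comparisons are tested is a paired t-test over per-seed RMSEs; the sketch below uses entirely hypothetical numbers and SciPy's `ttest_rel`, and is not the authors' protocol.

```python
import numpy as np
from scipy import stats

# Hypothetical per-seed RMSEs for one privacy budget; the paper's actual
# runs and test procedure are not specified in this summary.
rmse_dp   = np.array([0.998, 1.001, 0.996, 0.999, 1.002,
                      0.995, 1.000, 0.997, 0.999, 0.996])
rmse_dpsr = np.array([0.983, 0.981, 0.984, 0.982, 0.980,
                      0.985, 0.982, 0.983, 0.981, 0.984])

# Paired test: both methods evaluated on the same seeds/splits,
# so the runs are paired rather than independent samples.
t_stat, p_value = stats.ttest_rel(rmse_dp, rmse_dpsr)
print(f"t = {t_stat:.2f}, p = {p_value:.3g}")
```

Pairing assumes both methods share seeds and splits; `stats.wilcoxon` is a common non-parametric alternative when normality of the paired differences is doubtful.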
Practical Implications
- Better user experience: Platforms can now offer more accurate recommendations while still providing strong privacy guarantees, reducing churn caused by poor personalization.
- Regulatory compliance made easier: Companies subject to GDPR, CCPA, or upcoming AI‑privacy laws can adopt DPSR to meet ε‑DP requirements without sacrificing service quality.
- Plug-and-play component: Since DPSR is a post-processing layer, it can be inserted into existing pipelines that already use Laplace or Gaussian DP noise, with no need to redesign the whole recommender architecture (see the short usage sketch after this list).
- Resource‑efficient: The three stages rely on well‑studied algorithms (similarity computation, collaborative filtering, low‑rank matrix factorization) that are already optimized in many ML libraries, making deployment feasible at scale.
- Potential for cross‑domain use: Any system that works with sparse, low‑rank data (e.g., implicit feedback, social graphs, knowledge bases) could benefit from the same denoising‑after‑DP pattern.
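To make the plug-and-play point concrete, here is a short usage sketch reusing the `cf_denoise` and `lowrank_complete` helpers from the Methodology sketch above; `R_noisy` is assumed to come from whatever DP mechanism the existing pipeline already runs.

```python
# `R_noisy` is an existing Laplace/Gaussian DP release of the rating matrix;
# its epsilon accounting happened upstream, so no re-noising is needed here.
R_cf = cf_denoise(R_noisy, k=20)
R_final = lowrank_complete(0.5 * R_cf + 0.5 * R_noisy, rank=10)
# R_final inherits the upstream epsilon guarantee via post-processing immunity.
```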
Limitations & Future Work
- Synthetic focus: The current evaluation uses synthetic rating matrices; real‑world datasets (e.g., MovieLens, Amazon) with complex bias patterns remain to be tested.
- Scalability of Stage 2: Computing dense item-item similarity on very large catalogs can be costly; approximate nearest-neighbor methods or graph-sampling techniques may be required (a neighbor-search sketch follows this list).
- Fixed privacy budget: DPSR assumes a single global ε. Extending the framework to support personalized privacy budgets (per‑user or per‑item) is an open direction.
- Robustness to adversarial attacks: While DP protects against statistical leakage, the denoising steps could inadvertently amplify certain attacks (e.g., model‑inversion). Formal robustness analysis is needed.
- Integration with deep‑learning recommenders: Adapting DPSR to neural collaborative filtering or transformer‑based recommenders could unlock further gains but requires careful handling of gradient‑based privacy accounting.
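On the Stage 2 scalability point, a standard mitigation is to compute only each item's top-k neighbors rather than the full dense similarity matrix. Below is a sketch assuming scikit-learn; the brute-force cosine search is exact, but the same access pattern is what an approximate index (e.g., FAISS or Annoy) would serve on larger catalogs, and the data here is randomly generated purely for illustration.

```python
import numpy as np
from scipy.sparse import csr_matrix
from sklearn.neighbors import NearestNeighbors

# Hypothetical stand-in for the noisy DP release: item vectors as rows of
# a sparse (items x users) matrix, so neighbor search runs per item.
n_users, n_items, nnz, k = 5_000, 10_000, 200_000, 20
rng = np.random.default_rng(0)
item_matrix = csr_matrix(
    (rng.random(nnz),
     (rng.integers(0, n_items, nnz), rng.integers(0, n_users, nnz))),
    shape=(n_items, n_users),
)

# Top-k cosine neighbors without materializing an n_items x n_items matrix.
nn = NearestNeighbors(n_neighbors=k + 1, metric="cosine", algorithm="brute")
nn.fit(item_matrix)
dist, idx = nn.kneighbors(item_matrix)           # each item matches itself first
sims, neighbors = 1.0 - dist[:, 1:], idx[:, 1:]  # drop self-match, keep k
```

The resulting `sims`/`neighbors` arrays are exactly what Stage 2's weighted average needs, at O(n·k) memory instead of O(n²).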
Bottom line: DPSR demonstrates that clever post‑processing can turn the privacy‑utility trade‑off from a hard limit into a tunable engineering problem, opening the door for privacy‑first recommender systems that don’t compromise on relevance.
Authors
- Sarwan Ali
Paper Information
- arXiv ID: 2512.18932v1
- Categories: cs.LG, cs.CR
- Published: December 22, 2025
- PDF: https://arxiv.org/pdf/2512.18932v1