[Paper] Balancing Fidelity, Utility, and Privacy in Synthetic Cardiac MRI Generation: A Comparative Study
Source: arXiv - 2603.04340v1
Overview
Deep learning models for cardiac MRI (CMR) face two major hurdles: too few publicly available scans, and strict privacy laws that limit data sharing. This paper evaluates three state‑of‑the‑art generative models (DDPM, Latent Diffusion, and Flow Matching) to see how well they can create realistic, useful, and privacy‑preserving synthetic CMR images.
Key Contributions
- Systematic benchmark of three generative families (diffusion‑based and flow‑based) on the same cardiac MRI dataset.
- Two‑stage conditioning pipeline that first generates anatomical masks and then synthesizes the full‑resolution image, mirroring clinical workflow.
- Multi‑dimensional evaluation framework covering image fidelity (visual quality), downstream utility (segmentation performance), and privacy leakage (membership inference attacks).
- Quantitative trade‑off analysis showing that DDPM strikes the best overall balance, while Flow Matching offers stronger privacy at a modest utility cost.
- Open‑source reference implementation and reproducible evaluation scripts for the community.
Methodology
- Data preparation – A modest CMR cohort (≈200 patients) is split into training, validation, and test sets. Anatomical masks (e.g., left‑ventricle, myocardium) are extracted using a pre‑trained segmentation network.
- Two‑stage generation
  - Stage 1: A mask‑generator (conditional UNet) creates plausible cardiac masks from random noise.
  - Stage 2: The mask conditions a high‑resolution image generator. Three generators are compared:
    - DDPM – classic denoising diffusion with a UNet denoiser.
    - Latent Diffusion Model (LDM) – diffusion operates in a compressed latent space for speed.
    - Flow Matching (FM) – learns a velocity field that defines a continuous normalizing flow, transporting noise to images by integrating an ODE.
- Evaluation metrics
  - Fidelity: Fréchet Inception Distance (FID), Structural Similarity Index (SSIM), and visual Turing tests.
  - Utility: Train a downstream segmentation model on synthetic data (or a mix of synthetic + real) and measure Dice scores on held‑out real scans.
  - Privacy: Membership inference attacks and attribute disclosure tests to estimate how much patient‑specific information leaks.
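To make the difference between the Stage 2 generators concrete, the following minimal sketch contrasts the training targets of DDPM and Flow Matching on a toy vector. It assumes the standard noise-prediction form of DDPM and a straight-line noise-to-data path for Flow Matching; the summary does not say which variants the paper uses, and mask conditioning and the networks themselves are omitted. The function names and the `alpha_bar_t` value are illustrative, not taken from the paper's code.

```python
import math
import random


def ddpm_training_pair(x0, alpha_bar_t, rng):
    """Noise a clean image x0 to step t; the denoiser would be trained to predict eps."""
    eps = [rng.gauss(0.0, 1.0) for _ in x0]
    a, b = math.sqrt(alpha_bar_t), math.sqrt(1.0 - alpha_bar_t)
    x_t = [a * x + b * e for x, e in zip(x0, eps)]
    return x_t, eps  # (network input, regression target)


def flow_matching_training_pair(x1, t, rng):
    """Straight path from noise (t=0) to data x1 (t=1); the target is the velocity."""
    noise = [rng.gauss(0.0, 1.0) for _ in x1]
    x_t = [(1.0 - t) * n + t * d for n, d in zip(noise, x1)]
    velocity = [d - n for n, d in zip(noise, x1)]  # constant along the assumed path
    return x_t, velocity  # (network input, regression target)


rng = random.Random(0)
image = [0.2, -0.5, 0.9, 0.0]  # stand-in for flattened pixel intensities
x_t, eps = ddpm_training_pair(image, alpha_bar_t=0.5, rng=rng)
z_t, v = flow_matching_training_pair(image, t=0.25, rng=rng)
```

In both cases a network would be regressed onto the returned target. Sampling then either reverses the noising chain step by step (DDPM) or integrates the predicted velocity from t = 0 to t = 1 (FM).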
Results & Findings
| Model | FID (↓) | SSIM (↑) | Segmentation Dice (↑) | Membership Attack Success (↓) |
|---|---|---|---|---|
| DDPM | 23.1 | 0.89 | 0.84 (vs. 0.88 real) | 52 % (near random) |
| LDM | 27.4 | 0.86 | 0.81 | 55 % |
| FM | 31.0 | 0.84 | 0.78 | 41 % (best privacy) |
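As a quick worked reading of the table, the sketch below turns the reported numbers into two derived quantities: the Dice shortfall relative to the real-data baseline, and the attack success relative to chance. It assumes that the 0.88 real-data Dice quoted in the DDPM row is the baseline for all three models, and that the attack is evaluated on a balanced member/non-member split so that chance is 50 %; neither is stated explicitly in this summary.

```python
REAL_DICE = 0.88  # real-data baseline quoted in the DDPM row (assumed shared)
CHANCE = 50.0     # assumed chance level for a balanced member/non-member split

results = {
    "DDPM": {"dice": 0.84, "attack": 52.0},
    "LDM": {"dice": 0.81, "attack": 55.0},
    "FM": {"dice": 0.78, "attack": 41.0},
}


def relative_dice_gap(dice, real=REAL_DICE):
    """Percentage shortfall of synthetic-trained Dice relative to real-trained Dice."""
    return 100.0 * (real - dice) / real


summary = {
    name: (round(relative_dice_gap(r["dice"]), 1), r["attack"] - CHANCE)
    for name, r in results.items()
}

for name, (gap, delta) in summary.items():
    print(f"{name}: Dice gap {gap}% of real, attack {delta:+.0f} points vs chance")
```

Under these assumptions, a success rate below 50 %, as reported for FM, means the attack does worse than guessing.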
- Fidelity: Both diffusion models beat FM on FID and SSIM, with DDPM ahead of LDM (FID 23.1 vs. 27.4, against 31.0 for FM), indicating sharper, more anatomically accurate images.
- Utility: Segmentation models trained on DDPM‑generated data reach a Dice of 0.84, within 5 % of the 0.88 achieved by models trained on real data, confirming high downstream usefulness.
- Privacy: FM’s flow‑based approach leaks the least patient‑specific signal, but the drop in segmentation quality may be unacceptable for many clinical tasks.
- Data‑efficiency: All models retain reasonable performance even when trained on as few as 50 real scans, highlighting their suitability for scarce‑data environments.
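The utility numbers above are Dice scores, which measure the overlap between a predicted and a reference segmentation mask. The following is a minimal pure-Python sketch of the standard Dice coefficient on flattened binary masks; the toy masks are made up for illustration, and the paper's own evaluation code may differ (for example by averaging per structure or per patient).

```python
def dice_score(pred, target):
    """Dice = 2|A ∩ B| / (|A| + |B|) for two binary masks of equal size."""
    if len(pred) != len(target):
        raise ValueError("masks must have the same number of pixels")
    intersection = sum(1 for p, t in zip(pred, target) if p and t)
    total = sum(1 for p in pred if p) + sum(1 for t in target if t)
    # Two empty masks agree perfectly; treating that as 1.0 is a common convention.
    return 1.0 if total == 0 else 2.0 * intersection / total


# Toy flattened masks: 1 marks a pixel labelled as, say, myocardium.
pred = [0, 1, 1, 1, 0, 0]
target = [0, 1, 1, 0, 1, 0]
score = dice_score(pred, target)  # 2 overlapping pixels, 3 + 3 labelled pixels
```

A score of 1.0 means the masks coincide exactly and 0.0 means they do not overlap at all.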
Practical Implications
- Data augmentation for rare cardiac conditions – Developers can safely expand limited CMR datasets without exposing patient identities, accelerating model prototyping.
- Cross‑institutional collaborations – Synthetic datasets can be shared between hospitals or with cloud providers, which may ease GDPR/HIPAA constraints on sharing patient data.
- Rapid prototyping of AI pipelines – Teams can bootstrap segmentation or classification models using only synthetic scans, then fine‑tune on a small real subset for final validation.
- Regulatory compliance – The privacy evaluation framework offers a concrete metric that can be reported to ethics boards or auditors when releasing synthetic medical data.
- Tooling – The provided codebase integrates with popular frameworks (PyTorch, MONAI), making it easy to plug into existing training pipelines.
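To illustrate what a reportable privacy metric of this kind might look like, here is a minimal sketch of a distance-based membership inference attack: a record is guessed to be a training member if it lies close to some synthetic sample. This is one simple attack chosen for illustration, not necessarily the one used in the paper; the feature vectors, threshold, and function names are all hypothetical.

```python
def nearest_distance(candidate, synthetic):
    """Euclidean distance from a candidate record to its closest synthetic sample."""
    return min(
        sum((c - s) ** 2 for c, s in zip(candidate, sample)) ** 0.5
        for sample in synthetic
    )


def attack_success_rate(members, non_members, synthetic, threshold):
    """Accuracy of guessing 'member' whenever a record is within `threshold` of synthetic data."""
    correct = sum(nearest_distance(m, synthetic) <= threshold for m in members)
    correct += sum(nearest_distance(n, synthetic) > threshold for n in non_members)
    return correct / (len(members) + len(non_members))


# Toy 2-D feature vectors standing in for image embeddings (hypothetical values).
synthetic = [(0.0, 0.0), (1.0, 1.0), (2.0, 0.5)]
members = [(0.1, 0.0), (1.0, 1.2), (3.0, 3.0)]       # records used for training
non_members = [(0.0, 0.2), (4.0, 4.0), (5.0, 1.0)]   # held-out records
rate = attack_success_rate(members, non_members, synthetic, threshold=0.3)
```

With equal numbers of members and non-members, a rate close to 0.5 indicates the attacker cannot tell the two groups apart, which is the sense in which a success rate is described as near random.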
Limitations & Future Work
- Scope of anatomy – The study focuses on short‑axis cardiac MRI; extending to other views (e.g., long‑axis, 4‑D flow) may reveal new challenges.
- Evaluation of clinical realism – While quantitative metrics are strong, a larger panel of radiologists could provide deeper insight into subtle artefacts.
- Hybrid models – Combining diffusion’s fidelity with flow’s privacy (e.g., privacy‑aware diffusion) is an open research direction.
- Scalability – Diffusion models remain computationally intensive; future work should explore acceleration techniques (e.g., distillation, fewer diffusion steps).
Bottom line: For developers looking to enrich cardiac MRI datasets while respecting patient privacy, diffusion‑based synthetic generators, especially DDPM, offer the most practical balance of fidelity, utility, and privacy today.
Authors
- Madhura Edirisooriya
- Dasuni Kawya
- Ishan Kumarasinghe
- Isuri Devindi
- Mary M. Maleckar
- Roshan Ragel
- Isuru Nawinne
- Vajira Thambawita
Paper Information
- arXiv ID: 2603.04340v1
- Categories: cs.CV, cs.LG
- Published: March 4, 2026