[Paper] Enhancing Authorship Attribution with Synthetic Paintings
Source: arXiv - 2603.04343v1
Overview
The paper investigates whether synthetic paintings generated with modern text‑to‑image diffusion models can fill the data gap that has long hampered AI‑driven authorship attribution. By fine‑tuning Stable Diffusion via DreamBooth on a handful of real works, the authors create realistic “fake” paintings and blend them with the original dataset. The resulting hybrid training set boosts the accuracy of classifiers that identify the author of a given artwork—an advance that could make AI‑based art authentication more reliable in real‑world, data‑scarce scenarios.
Key Contributions
- Synthetic data pipeline: Demonstrates how DreamBooth‑fine‑tuned Stable Diffusion can produce high‑fidelity paintings that preserve the stylistic nuances of a target artist.
- Hybrid training strategy: Introduces a straightforward recipe for mixing real and synthetic images to improve downstream classification performance.
- Empirical validation: Shows consistent gains in ROC‑AUC and overall accuracy across multiple artist‑pair experiments, confirming that synthetic samples act as effective regularizers.
- Open‑source reproducibility: Provides code, model checkpoints, and a curated dataset split, enabling other researchers and developers to replicate and extend the work.
Methodology
- Data collection: The authors start with a modest corpus of digitized paintings (≈ 200–300 per artist) from public museum archives.
- DreamBooth fine‑tuning: For each artist, a Stable Diffusion model is fine‑tuned on the artist’s real works plus a few textual prompts that capture the artist’s name and style. This yields a personalized generator capable of producing new images that “look like” the artist.
- Synthetic image generation: The fine‑tuned model generates 1,000–2,000 synthetic paintings per artist, using diverse prompts to encourage variation while staying within the learned style distribution.
- Hybrid dataset assembly: Real and synthetic images are combined in several ratios (e.g., 1:1, 1:2) to form training sets; a held‑out real‑only test set remains untouched for evaluation.
- Classification model: A standard ResNet‑50 backbone (pre‑trained on ImageNet) is fine‑tuned on the hybrid sets to predict the artist label.
- Evaluation metrics: ROC‑AUC, top‑1 accuracy, and confusion matrices are reported to assess both discriminative power and generalization.
The pipeline is deliberately simple—no exotic architectures or adversarial training—so developers can plug it into existing computer‑vision workflows with minimal friction.
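The prompt‑diversification step can be illustrated with a small helper. The template strings and the `sks` rare‑token convention (a common DreamBooth practice) are assumptions for illustration, not details quoted from the paper:

```python
import itertools

def build_prompts(artist_token, n):
    """Cross subjects with stylistic modifiers to get varied prompts that
    all stay anchored to the fine-tuned artist token (e.g. "sks painting")."""
    subjects = ["a landscape", "a portrait", "a still life", "a harbor scene"]
    styles = ["loose brushwork", "muted palette", "strong impasto"]
    combos = itertools.product(subjects, styles)
    return [f"{subj} with {style}, in the style of {artist_token}"
            for subj, style in itertools.islice(combos, n)]

prompts = build_prompts("sks painting", 6)
```

Each prompt keeps the learned artist token fixed while varying subject and style modifiers, which is one simple way to encourage variation without drifting out of the learned distribution.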
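The hybrid‑dataset assembly step can be sketched in plain Python. The function name `mix_hybrid` and its arguments are hypothetical, not from the paper's released code:

```python
import random

def mix_hybrid(real_paths, synthetic_paths, synth_ratio=1.0, seed=0):
    """Combine real and synthetic image paths at a synthetic:real ratio.

    synth_ratio=1.0 corresponds to the paper's 1:1 setting, 2.0 to 1:2.
    Synthetic images are randomly subsampled to hit the target ratio.
    """
    rng = random.Random(seed)
    n_synth = min(len(synthetic_paths), int(len(real_paths) * synth_ratio))
    chosen = rng.sample(synthetic_paths, n_synth)
    # Tag each path with its source so training code can audit the mix later.
    dataset = [(p, "real") for p in real_paths] + [(p, "synthetic") for p in chosen]
    rng.shuffle(dataset)
    return dataset

# Example: 300 real works per artist, 600 synthetic candidates, mixed 1:1.
real = [f"real_{i}.jpg" for i in range(300)]
fake = [f"synth_{i}.jpg" for i in range(600)]
hybrid = mix_hybrid(real, fake, synth_ratio=1.0)
```

The held‑out real‑only test set would be split off before this step, so no synthetic image can leak into evaluation.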
Results & Findings
| Training set | ROC‑AUC | Top‑1 Accuracy |
|---|---|---|
| Real only | 0.78 | 71 % |
| Real + Synthetic (1:1) | 0.86 | 78 % |
| Real + Synthetic (1:2) | 0.84 | 76 % |
- Adding synthetic paintings consistently lifts ROC‑AUC by roughly 6–8 points (0.78 → 0.84–0.86).
- The best performance appears when synthetic data roughly matches the amount of real data (1:1 ratio), suggesting diminishing returns beyond that point.
- Error analysis shows that the classifier becomes less prone to over‑fitting on idiosyncratic brush‑stroke artifacts present only in the limited real set.
In short, synthetic images act as a powerful data‑augmentation tool, improving both discrimination and robustness.
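For a binary artist pair, ROC‑AUC values like those in the table can be computed directly from classifier scores via the pairwise‑ranking (Mann–Whitney) formulation. This helper is illustrative, not the paper's evaluation code:

```python
def roc_auc(scores, labels):
    """ROC-AUC for binary labels (1 = target artist), computed as the
    probability that a positive example outscores a negative one,
    with ties counted as half a win."""
    pos = [s for s, y in zip(scores, labels) if y == 1]
    neg = [s for s, y in zip(scores, labels) if y == 0]
    wins = sum((p > n) + 0.5 * (p == n) for p in pos for n in neg)
    return wins / (len(pos) * len(neg))

# Perfect separation of the two artists gives 1.0.
print(roc_auc([0.9, 0.8, 0.3, 0.2], [1, 1, 0, 0]))  # → 1.0
```

In practice one would use a library routine (e.g. scikit‑learn's `roc_auc_score`), but the rank formulation makes clear what a move from 0.78 to 0.86 means: more positive–negative pairs ranked correctly.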
Practical Implications
- Art market & authentication services: Companies that verify provenance can augment scarce provenance‑verified images with synthetic ones, reducing false positives and false negatives without needing massive labeled collections.
- Cultural heritage digitization: Museums digitizing small collections can train reliable style‑recognition models without waiting for large crowdsourced labeling efforts.
- Developer tooling: The approach can be wrapped into a plug‑and‑play library—feed a few labeled images, get a synthetic generator, and train a classifier—all using familiar PyTorch or TensorFlow APIs.
- Beyond paintings: The same hybrid strategy could be applied to any domain where data is scarce but generative models exist (e.g., historical document classification, medical imaging with synthetic lesions).
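A plug‑and‑play wrapper of the kind envisioned above might look like the following skeleton. The class and method names (`HybridAttributor`, `fit`, the injected `generate_fn` and `train_fn`) are invented for illustration; the diffusion generator and classifier trainer are stubbed out as callables:

```python
from collections.abc import Callable

class HybridAttributor:
    """Skeleton of a hybrid real+synthetic attribution pipeline.

    generate_fn stands in for a DreamBooth-fine-tuned generator: given an
    artist name and a count, it returns that many synthetic samples.
    train_fn stands in for fine-tuning a classifier (e.g. a ResNet-50).
    """
    def __init__(self, generate_fn: Callable, train_fn: Callable):
        self.generate_fn = generate_fn
        self.train_fn = train_fn

    def fit(self, real_by_artist: dict, synth_ratio: float = 1.0):
        data, labels = [], []
        for artist, works in real_by_artist.items():
            # Generate synthetic samples proportional to the real count.
            synth = self.generate_fn(artist, int(len(works) * synth_ratio))
            data += list(works) + list(synth)
            labels += [artist] * (len(works) + len(synth))
        return self.train_fn(data, labels)

# Stubbed usage: a fake generator and a trainer that just reports sizes.
model = HybridAttributor(
    generate_fn=lambda artist, n: [f"{artist}_synth_{i}" for i in range(n)],
    train_fn=lambda data, labels: {"n_samples": len(data)},
)
result = model.fit({"Monet": ["m1", "m2"], "Degas": ["d1", "d2"]})
```

Keeping the generator and trainer as injected callables means the same wrapper works with any diffusion backend or classifier framework.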
Limitations & Future Work
- Synthetic realism ceiling: While DreamBooth captures high‑level style, subtle material cues (canvas texture, craquelure) are still missing, which may matter for forensic experts.
- Artist similarity bias: The method works best when the target artists have distinct visual vocabularies; highly overlapping styles may still confuse the classifier.
- Scalability to many artists: Fine‑tuning a separate diffusion model per artist becomes costly as the number of classes grows. Future work could explore multi‑artist conditioning or latent‑space interpolation to share generators.
- Human evaluation: The paper relies on quantitative metrics; a user study with art historians would strengthen claims about “authentic‑looking” synthetic works.
Overall, the study opens a practical pathway for developers to leverage generative AI as a data‑augmentation ally in the niche but high‑stakes field of artwork authorship attribution.
Authors
- Clarissa Loures
- Caio Hosken
- Luan Oliveira
- Gianlucca Zuin
- Adriano Veloso
Paper Information
- arXiv ID: 2603.04343v1
- Categories: cs.CV, cs.LG
- Published: March 4, 2026