AI Look-Alike Search for OF Creators — Need Advice on Better Face Models
Source: Dev.to
What I’m Building (Quick Overview)
- Users upload an image (reference photo / celebrity image)
- The system finds OF models with similar facial characteristics
- Results are ranked using face embeddings + vector similarity search
- Everything currently runs on CPU, but a move to GPU is being considered for scale and experimentation
What I’m Building (More Detail)
The system allows users to upload an image and receive a list of OF models with similar facial characteristics.
The intent is to support visual discovery, where perceived similarity matters more than exact identity matching.
Key Constraints
- Similarity over identity – ranking by perceived similarity, not strict identity verification
- Low tolerance for false positives – returning visually different faces as “similar” is more harmful than missing a potential match
- Real‑world images – dataset consists of non‑studio images with varying lighting, poses, resolutions, and overall quality
- Scalability – the solution needs to scale beyond 100 k images without significant drops in accuracy or performance
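One way to act on the low false-positive tolerance is to calibrate a similarity cutoff against a small set of hand-labeled pairs. The sketch below is only an illustration and is not part of the current pipeline: the function name `pick_threshold`, the labeled scores, and the target rate are all hypothetical.

```python
def pick_threshold(negative_scores, max_false_positive_rate):
    """Return a cutoff T such that accepting only scores > T lets through
    at most the target share of known-dissimilar pairs.

    negative_scores: cosine similarities of pairs a human judged NOT similar.
    """
    if not negative_scores:
        raise ValueError("need at least one labeled negative pair")
    ranked = sorted(negative_scores, reverse=True)
    # Number of negatives we are willing to let through.
    allowed = int(max_false_positive_rate * len(ranked))
    if allowed >= len(ranked):
        return float("-inf")  # every negative may pass, so no cutoff is needed
    # The first negative that must be rejected sets the cutoff.
    return ranked[allowed]


# Hypothetical scores for four pairs labeled "not similar".
negatives = [0.2, 0.4, 0.6, 0.8]
threshold = pick_threshold(negatives, max_false_positive_rate=0.25)
accepted_negatives = [s for s in negatives if s > threshold]
```

A candidate would then be shown only if its score is strictly greater than the returned cutoff. Raising the cutoff trades missed matches for fewer false positives, which fits the priority stated above.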
Current Pipeline (CPU‑Based)
- Face detection and alignment
- Feature extraction using a pre‑trained face model
- Storing embeddings in a vector index
- Nearest‑neighbor search using cosine similarity
At the current scale the system works reasonably well, but both accuracy and performance are becoming limiting factors.
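The last two steps of the pipeline can be sketched as an exact, brute-force cosine search in NumPy. This is an assumption-laden illustration: the post does not say which vector index is in use, and the function names and toy vectors here are hypothetical.

```python
import numpy as np


def normalize_rows(vectors):
    """Scale each row to unit length so a dot product equals cosine similarity."""
    vectors = np.asarray(vectors, dtype=np.float32)
    norms = np.linalg.norm(vectors, axis=1, keepdims=True)
    return vectors / np.clip(norms, 1e-12, None)


def top_k_similar(query, index_matrix, k=5, min_score=0.0):
    """Return (row_index, cosine_score) pairs, best first, at or above min_score.

    index_matrix must already be row-normalized (see normalize_rows).
    """
    q = normalize_rows(np.asarray(query).reshape(1, -1))[0]
    scores = index_matrix @ q              # one cosine score per stored face
    order = np.argsort(-scores)[:k]        # indices of the k highest scores
    return [(int(i), float(scores[i])) for i in order if scores[i] >= min_score]


# Toy 3-dimensional vectors stand in for real face embeddings.
index = normalize_rows([[1.0, 0.0, 0.0], [0.9, 0.1, 0.0], [0.0, 1.0, 0.0]])
results = top_k_similar([1.0, 0.0, 0.0], index, k=2, min_score=0.5)
```

With 512-dimensional float32 embeddings, 100,000 vectors take roughly 205 MB, so an exact search like this may remain workable for a while before an approximate index becomes necessary.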
Current Model Setup (InsightFace)
Face embeddings are generated using InsightFace, specifically the buffalo_l model bundle.
- Face detection and alignment via InsightFace
- Feature extraction using the buffalo_l model
- Embeddings stored for similarity search
- Cosine similarity for ranking similar faces
This setup provides a solid baseline, but for look‑alike matching even small inaccuracies are very noticeable.
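For readers unfamiliar with the library, the setup above might look roughly like the following. It is a sketch under assumptions, not the exact code behind this project: the helper names are hypothetical, the choice to keep only the largest detected face is an assumption, and the image is assumed to be a BGR array such as one loaded with OpenCV.

```python
def largest_face(faces):
    """Pick the detection with the biggest bounding box, or None if empty.

    Each face is expected to expose bbox as (x1, y1, x2, y2).
    """
    if not faces:
        return None
    return max(
        faces,
        key=lambda f: (f.bbox[2] - f.bbox[0]) * (f.bbox[3] - f.bbox[1]),
    )


def build_analyzer(use_gpu=False):
    """Load the buffalo_l bundle; needs insightface and onnxruntime installed."""
    from insightface.app import FaceAnalysis  # lazy import: heavy dependency

    providers = ["CPUExecutionProvider"]
    if use_gpu:
        providers = ["CUDAExecutionProvider", "CPUExecutionProvider"]
    analyzer = FaceAnalysis(name="buffalo_l", providers=providers)
    # ctx_id=-1 selects CPU; a non-negative id selects that GPU device.
    analyzer.prepare(ctx_id=0 if use_gpu else -1, det_size=(640, 640))
    return analyzer


def embed_image(analyzer, image_bgr):
    """Return the L2-normalized embedding of the largest face, or None."""
    face = largest_face(analyzer.get(image_bgr))
    return None if face is None else face.normed_embedding
```

Because `normed_embedding` is already unit length, a plain dot product between two such vectors gives their cosine similarity.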
Where the System Struggles
- Visually similar faces sometimes rank lower than expected
- Different individuals with shared facial traits can appear as false positives
- Lighting, pose, and image quality introduce noise
- CPU inference becomes a bottleneck during re‑indexing and experimentation
Because this is a look‑alike use case, even small errors can significantly affect perceived quality.
CPU vs GPU — Is the Move Worth It?
I’m planning to migrate the pipeline to GPU‑based inference, but want to ensure the model choice justifies the move.
- Which face models provide the best results for visual similarity, not identity recognition?
- Does GPU inference unlock meaningfully better accuracy, or is it mainly a speed improvement?
- Are there models that are simply not practical to run on CPU at this scale?
If I’m going to reprocess 100 k+ OF model images, I want to do it with the right model.
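Before committing to a full reprocess, it may help to time a small sample and extrapolate. The helpers below are a hypothetical sketch: `process` stands for whatever per-image function the pipeline exposes, and no timing figures here come from measurements.

```python
import time


def measure_seconds_per_item(process, items):
    """Average wall-clock seconds that process(item) takes over items."""
    if not items:
        raise ValueError("need at least one item to time")
    start = time.perf_counter()
    for item in items:
        process(item)
    return (time.perf_counter() - start) / len(items)


def estimate_reindex_hours(seconds_per_item, total_items=100_000):
    """Extrapolate a per-item timing to a full re-index, in hours."""
    return seconds_per_item * total_items / 3600.0


# Illustrative figure only: 0.36 s per image is a made-up number.
hours = estimate_reindex_hours(0.36, total_items=100_000)
```

Running the same measurement once per backend on the same sample gives a like-for-like estimate of how much re-indexing time a GPU would actually save.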
What I’m Looking for in a Better Face Model
- Produce high‑quality embeddings for similarity search
- Perform well on non‑ideal, real‑world images
- Scale efficiently beyond 100 k images
- Benefit from GPU acceleration
- Support fine‑tuning (or perform well out of the box) for look‑alike matching
Open to both open‑source and commercial solutions.
Real‑World Context
This work is part of a discovery platform where users can upload an image and find visually similar OF models using AI‑based face similarity.
The project is called Explore.Fans, and face similarity search is one of its core components.
Questions for the Community
If you’ve worked with face similarity or face recognition models at scale, I’d appreciate your input:
- Which models gave you the best results for look‑alike similarity?
- Did GPU inference improve accuracy, or mostly performance?
- Any experience fine‑tuning models for similarity‑based ranking?
- Anything you’d avoid based on real‑world experience?
Thanks in advance — happy to share more details if helpful.
References
InsightFace (DeepInsight)