Building a Pet Insurance Comparison Engine: Handling Variable Premiums Across 15 French Insurers
Source: Dev.to
Introduction
French pet insurance has grown 34 % since 2022, driven by rising veterinary costs and increased pet ownership post‑COVID. Comparing products programmatically is a nightmare: 15 major insurers, each with its own pricing grid based on species, breed, age, region, and deductible. Below is an overview of how I built a comparison engine that handles this complexity.
Data Sources
| Insurer | Data Format | Notes |
|---|---|---|
| Santévet | JSON API (unofficial, scraped from their quote widget) | Real‑time quotes |
| Assurimo | PDF tariff grids (updated quarterly) | Requires PDF parsing |
| Groupama | Static tables by risk category | Simple lookup |
| Dalma | Dynamic pricing engine (quote request required) | Needs API call |
| … | … | … |
Normalization Schema
A unified JSON schema covers ~80 % of cases:
    {
      "insurer_id": "santevet",
      "product_id": "sv-excellence",
      "species": "cat",
      "breed_risk_group": 2,
      "age_min_months": 12,
      "age_max_months": 84,
      "region": "IDF",
      "deductible_pct": 20,
      "ceiling_annual_eur": 3000,
      "monthly_premium_eur": 38.50,
      "reimbursement_basis": "actual_costs",
      "waiting_period_days": 30
    }
The remaining 20 % (hereditary conditions, breed exclusions, complementary modules) are captured with an exclusions array and an add_ons object.
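As a rough illustration, the schema could be mirrored in code along these lines. This is a minimal sketch, not the engine's actual model: the `PremiumRecord` class, the `covers_age` helper, the inclusive age bounds, and the shapes of `exclusions` (list of strings) and `add_ons` (free-form dictionary) are all assumptions, and the sample exclusion and add-on values are hypothetical.

```python
from dataclasses import dataclass, field
from typing import Any, Dict, List


@dataclass
class PremiumRecord:
    # Fields taken from the unified schema.
    insurer_id: str
    product_id: str
    species: str
    breed_risk_group: int
    age_min_months: int
    age_max_months: int
    region: str
    deductible_pct: int
    ceiling_annual_eur: int
    monthly_premium_eur: float
    reimbursement_basis: str
    waiting_period_days: int
    # Assumed shapes for the cases the core schema does not cover.
    exclusions: List[str] = field(default_factory=list)
    add_ons: Dict[str, Any] = field(default_factory=dict)

    def covers_age(self, age_months: int) -> bool:
        # Assumption: both age bounds are inclusive.
        return self.age_min_months <= age_months <= self.age_max_months


record = PremiumRecord(
    insurer_id="santevet",
    product_id="sv-excellence",
    species="cat",
    breed_risk_group=2,
    age_min_months=12,
    age_max_months=84,
    region="IDF",
    deductible_pct=20,
    ceiling_annual_eur=3000,
    monthly_premium_eur=38.50,
    reimbursement_basis="actual_costs",
    waiting_period_days=30,
    exclusions=["hip_dysplasia"],  # hypothetical exclusion code
    add_ons={"prevention": {"monthly_premium_eur": 5.0}},  # hypothetical add-on
)
```

Keeping `exclusions` and `add_ons` optional with empty defaults means records from insurers without such clauses need no special handling.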
Breed‑to‑Risk Mapping
French insurers use different classification systems:
- Santévet – 4 risk groups (1 = low, 4 = high)
- Assurimo – 7 categories by morphology
- Allianz – breed whitelist / blacklist
I built a cross‑walk table that maps 380 dog breeds and 90 cat breeds to a normalized 5‑level risk scale, using the Fédération Cynologique Internationale (FCI) breed standards as the anchor.
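One possible shape for such a cross-walk is a table keyed by species and breed that stores the normalized level next to each insurer's native class. The sketch below assumes that layout; the `CROSSWALK` entries, the risk values in them, and the helper names are placeholders for illustration, not real insurer classifications.

```python
from typing import Dict, Optional, Tuple

# (species, breed_key) -> normalized level plus native class per insurer.
# All values are placeholders, not real classifications.
CROSSWALK: Dict[Tuple[str, str], Dict[str, int]] = {
    ("dog", "labrador_retriever"): {"normalized": 3, "santevet": 2, "assurimo": 4},
    ("dog", "french_bulldog"): {"normalized": 5, "santevet": 4, "assurimo": 6},
    ("cat", "european_shorthair"): {"normalized": 1, "santevet": 1, "assurimo": 1},
}


def breed_key(breed: str) -> str:
    # Normalize user input such as "Labrador Retriever" to a table key.
    return "_".join(breed.strip().lower().split())


def normalized_risk(species: str, breed: str) -> Optional[int]:
    # Level on the 5-point scale, or None when the breed is not mapped.
    entry = CROSSWALK.get((species, breed_key(breed)))
    return entry["normalized"] if entry else None


def native_class(species: str, breed: str, insurer_id: str) -> Optional[int]:
    # Insurer-specific class used to look up that insurer's tariff grid.
    entry = CROSSWALK.get((species, breed_key(breed)))
    return entry.get(insurer_id) if entry else None
```

Returning `None` for unmapped breeds lets the caller decide whether to skip the insurer or ask the user for more detail, rather than guessing a risk level.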
Regional Cost Adjustments
Veterinary costs vary significantly by region (e.g., a consultation costs €28 in rural Creuse vs €68 in Paris 16th). Insurers adjust premiums in different ways:
- By department code
- By ZIP‑code prefix
- By an urban/rural flag only
Solution: a geolocation lookup table mapping INSEE commune codes to risk tiers, updated annually from the DREES veterinary care cost survey.
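A lookup of this kind might work roughly as follows. The sketch assumes a commune-level table with a department-level fallback; the tier values, the table contents, and the function names are illustrative assumptions. The only domain rule relied on is that the department is the first two characters of an INSEE commune code, or the first three for overseas codes starting with 97 or 98.

```python
from typing import Dict, Optional

# Placeholder tiers (1 = cheapest vet costs, 5 = most expensive).
COMMUNE_TIERS: Dict[str, int] = {
    "75116": 5,  # Paris 16th arrondissement
}
DEPARTMENT_TIERS: Dict[str, int] = {
    "75": 4,  # Paris
    "23": 1,  # Creuse
}


def department_of(insee_code: str) -> str:
    # Overseas departments use a three-character prefix (971, 972, ...).
    if insee_code.startswith(("97", "98")):
        return insee_code[:3]
    # Works for Corsica too, where the prefix is "2A" or "2B".
    return insee_code[:2]


def tier_for(insee_code: str) -> Optional[int]:
    # Prefer the commune-level tier, then fall back to the department.
    if insee_code in COMMUNE_TIERS:
        return COMMUNE_TIERS[insee_code]
    return DEPARTMENT_TIERS.get(department_of(insee_code))
```

The same `department_of` helper can serve insurers that price by department code, so one user location resolves to every insurer's regional key.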
Quote Retrieval Architecture
For insurers with quote APIs, a queue‑based system is used:
- User input triggers parallel quote requests across all insurers.
- Each request has a 3‑second timeout.
- Missing quotes fall back to cached tariff data (max 30 days old) and are flagged visually in the UI.
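The fan-out, timeout, and cache fallback described above can be sketched with `asyncio`. Note that this uses `asyncio.gather` rather than an actual queue, so it illustrates the behavior, not the queue-based design itself. The `Quote` type, the fetcher signature, and the cache layout (`insurer_id -> (premium, age_in_days)`) are assumptions.

```python
import asyncio
from dataclasses import dataclass
from typing import Awaitable, Callable, Dict, List, Optional, Tuple

MAX_CACHE_AGE_DAYS = 30

# Hypothetical fetcher type: takes the pet description, returns a premium.
Fetcher = Callable[[dict], Awaitable[float]]


@dataclass
class Quote:
    insurer_id: str
    monthly_premium_eur: float
    from_cache: bool = False  # lets the UI flag cached quotes


async def fetch_one(
    insurer_id: str,
    fetcher: Fetcher,
    pet: dict,
    cache: Dict[str, Tuple[float, int]],
    timeout_s: float,
) -> Optional[Quote]:
    try:
        premium = await asyncio.wait_for(fetcher(pet), timeout=timeout_s)
        return Quote(insurer_id, premium)
    except (asyncio.TimeoutError, ConnectionError):
        cached = cache.get(insurer_id)
        # Fall back only to cached tariffs that are at most 30 days old.
        if cached is None or cached[1] > MAX_CACHE_AGE_DAYS:
            return None
        return Quote(insurer_id, cached[0], from_cache=True)


async def fetch_all(
    fetchers: Dict[str, Fetcher],
    pet: dict,
    cache: Dict[str, Tuple[float, int]],
    timeout_s: float = 3.0,
) -> List[Quote]:
    # Launch every insurer request concurrently, each with its own timeout.
    tasks = [
        fetch_one(insurer_id, fetcher, pet, cache, timeout_s)
        for insurer_id, fetcher in fetchers.items()
    ]
    results = await asyncio.gather(*tasks)
    return [quote for quote in results if quote is not None]
```

Because each request carries its own timeout, one slow insurer delays the overall response by at most `timeout_s` seconds instead of blocking the whole comparison.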
The live engine covers 12 insurers with real‑time quotes and 3 with cached grids.
Freshness & Confidence Scoring
French law requires insurers to notify policyholders of premium changes 15 days before renewal. This creates a “freshness” problem for comparison sites.
Confidence score per premium record:
$$ \text{score} = 1 - \frac{\text{days\_since\_update}}{90} $$
- Records older than 90 days are excluded from comparisons and flagged for manual refresh.
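The scoring rule translates directly into a small function. This is a sketch: the function name, the use of `None` to signal exclusion, and the clamping of future-dated records to a score of 1 are assumptions.

```python
from datetime import date
from typing import Optional

MAX_AGE_DAYS = 90


def confidence_score(last_update: date, today: date) -> Optional[float]:
    # Returns None for records older than 90 days, which are excluded
    # from comparisons and should be flagged for manual refresh.
    days_since_update = (today - last_update).days
    if days_since_update > MAX_AGE_DAYS:
        return None
    # Assumption: a record dated in the future is treated as fresh.
    days_since_update = max(days_since_update, 0)
    return 1 - days_since_update / MAX_AGE_DAYS


# A record refreshed 45 days ago scores 0.5.
score = confidence_score(date(2024, 1, 1), date(2024, 2, 15))
```

A record exactly 90 days old scores 0 but is still included, since only records older than 90 days are excluded.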
Implementation Tips
- Start with the PDF parser – most insurers still distribute tariffs as PDFs, and building a reliable extractor took 3× longer than expected.
- Document the exclusions schema early – adding hereditary‑condition support retroactively broke four normalizers.
- Build an insurer‑change‑detection webhook – instead of polling, subscribe to insurer sitemap changes.
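For the change-detection tip, one way to spot updated tariff pages is to compare the `lastmod` values of two sitemap snapshots. The sketch below covers only that comparison step; how the snapshots are obtained (webhook, feed, or scheduled fetch) is left open, and the function names and example URLs are hypothetical.

```python
import xml.etree.ElementTree as ET
from typing import Dict, List

# Namespace defined by the sitemaps.org protocol.
NS = {"sm": "http://www.sitemaps.org/schemas/sitemap/0.9"}


def lastmod_by_url(sitemap_xml: str) -> Dict[str, str]:
    # Map each <loc> in the sitemap to its <lastmod> ("" when absent).
    root = ET.fromstring(sitemap_xml)
    result: Dict[str, str] = {}
    for url in root.findall("sm:url", NS):
        loc = url.findtext("sm:loc", default="", namespaces=NS).strip()
        lastmod = url.findtext("sm:lastmod", default="", namespaces=NS).strip()
        if loc:
            result[loc] = lastmod
    return result


def changed_urls(old_xml: str, new_xml: str) -> List[str]:
    # URLs that are new, or whose lastmod differs from the old snapshot.
    old = lastmod_by_url(old_xml)
    new = lastmod_by_url(new_xml)
    return sorted(url for url, mod in new.items() if old.get(url) != mod)
```

Each URL returned by `changed_urls` can then trigger a refresh of the corresponding insurer's normalizer instead of re-scraping every grid on a timer.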
Regulatory Considerations
Insurance comparison is subject to accuracy obligations under ACPR regulations, which adds compliance overhead. Ensure that:
- Premiums are refreshed within the legal notification window.
- All displayed data includes provenance and freshness indicators.
Call to Action
Have you built comparison engines in regulated industries? The insurance sector has unique challenges around accuracy and compliance. Feel free to discuss your experiences in the comments.