Python Guide: How to Detect If a Domain Is a Scam
Source: Dev.to
Introduction
Shopping online and signing up for new websites are everyday activities, but so is stumbling across scam domains. These shady sites may take your money, steal sensitive info, or vanish after operating for only a few weeks. This guide shows how to use Python to automatically screen websites for scam signals, explains why each check matters, provides a working script, and teaches you how to interpret the results.
Why scam domains are hard to spot
Scammers can create a slick web store or fake landing page in minutes. Common tricks include:
- Cheap or free domains registered in the last year, often just weeks ago
- WHOIS privacy shields (e.g., WhoisGuard, DomainsByProxy) that hide the owner’s identity
- No real email setup—just a web form, if that
- Broken or missing HTTPS
- Aggressive sales or big discounts to lure impulse buyers
- Almost no “real” policy pages, social proof, or company footprint
Many legitimate startups show some of these signals at first, of course. The more red flags you spot together, the higher the risk.
Technical signals that matter most
| Signal | Why it matters |
|---|---|
| Domain age – was the website registered in the last few months? | Most scams use brand‑new domains. |
| WHOIS privacy – owner hides behind privacy services | Makes verification difficult. |
| No MX record – lack of public email setup | Real businesses usually have MX records. |
| HTTPS/SSL – missing or expired certificates | Trust issue for visitors. |
| Suspicious on‑page content – “70% off today only!”, generic “secure checkout” badges | Classic scam tactics. |
| Missing or fake contact/policy pages – no easy way to reach out, copy‑pasted policies | Indicates low legitimacy. |
No single signal is proof of a scam, but several together raise the odds considerably.
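The HTTPS/SSL row can be checked more precisely than "did the page load". Here is a minimal sketch, using only the standard library, that reads a site's certificate and reports how many days remain before it expires. The helper names `days_until_expiry` and `cert_days_left` are made up for this example, and treating a failed handshake as "no valid certificate" is an assumption rather than a rule.

```python
import socket
import ssl
import time


def days_until_expiry(not_after, now=None):
    """Days left before a certificate's notAfter time (negative if expired).

    not_after uses the format returned by SSLSocket.getpeercert(),
    for example "Jan  5 09:34:43 2018 GMT".
    """
    expires = ssl.cert_time_to_seconds(not_after)
    now = time.time() if now is None else now
    return int((expires - now) // 86400)


def cert_days_left(host, port=443, timeout=10):
    """Connect to host, validate its certificate, return days until expiry.

    Returns None when the connection or certificate validation fails,
    which is worth treating as a red flag in its own right.
    """
    ctx = ssl.create_default_context()  # verifies the chain and the hostname
    try:
        with socket.create_connection((host, port), timeout=timeout) as sock:
            with ctx.wrap_socket(sock, server_hostname=host) as tls:
                cert = tls.getpeercert()
    except (OSError, ssl.SSLError):
        return None
    return days_until_expiry(cert["notAfter"])
```

Calling `cert_days_left("example.com")` needs network access. A result of `None` or a negative number means the certificate is missing, invalid, or expired.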
Fetching WHOIS, DNS, HTTPS, and content info in Python
Required libraries
- Python 3.7 or above
- python-whois, requests, beautifulsoup4, dnspython, tldextract
Install them with:
pip install python-whois requests beautifulsoup4 dnspython tldextract
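Several of these packages are imported under a different name than the one you install, which is an easy thing to trip over. The sketch below shows one possible way to confirm that everything is importable before running the scanner. The `REQUIRED` mapping and the `missing_packages` helper are illustrative names, not part of any of these libraries.

```python
import importlib.util

# PyPI package name -> module name used in `import` statements.
REQUIRED = {
    "python-whois": "whois",
    "requests": "requests",
    "beautifulsoup4": "bs4",
    "dnspython": "dns",
    "tldextract": "tldextract",
}


def missing_packages(required=REQUIRED):
    """Return the PyPI names whose import module cannot be found."""
    return [pkg for pkg, mod in required.items()
            if importlib.util.find_spec(mod) is None]


if __name__ == "__main__":
    missing = missing_packages()
    if missing:
        print("Missing:", " ".join(missing))
    else:
        print("All dependencies are installed.")
```

One caveat: a separate PyPI package named whois also installs a module called `whois`, so this check cannot tell the two apart. If the scanner fails on `whois.whois(...)`, check which of the two is installed.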
The script
The script below:
- Fetches WHOIS info to check domain age and privacy.
- Checks DNS records for email (MX).
- Tries to fetch the homepage using HTTPS (falls back to HTTP).
- Scrapes for suspicious text (flash sales, missing policies, trust badges).
- Combines evidence into a risk score and verdict.
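import re
import json
import whois
import requests
import dns.resolver
from bs4 import BeautifulSoup
from datetime import datetime, timezone
import tldextract

HEADERS = {"User-Agent": "Mozilla/5.0 (DomainRisk/0.1)"}
TIMEOUT = 10

def domain_age_days(w):
    # Days since registration, or None when WHOIS has no usable creation date.
    created = w.get("creation_date")
    if isinstance(created, list):  # some registries return several dates
        created = created[0] if created else None
    if not isinstance(created, datetime):
        return None
    if created.tzinfo is None:  # treat naive timestamps as UTC
        created = created.replace(tzinfo=timezone.utc)
    return (datetime.now(timezone.utc) - created).days

def whois_privacy(w):
    # Keyword heuristic over the registrant fields.
    text = " ".join(str(w.get(k, "")).lower() for k in ["registrar", "org", "name"])
    return any(t in text for t in ["privacy", "proxy", "whoisguard", "redacted", "withheld"])

def resolve_dns(domain):
    out = {"A": [], "MX": []}
    try:
        out["A"] = [r.to_text() for r in dns.resolver.resolve(domain, "A")]
    except Exception:  # lookup failed; leave the list empty
        pass
    try:
        out["MX"] = [r.to_text() for r in dns.resolver.resolve(domain, "MX")]
    except Exception:
        pass
    return out

def fetch(url):
    # Return the page HTML, or None on any request error or non-2xx status.
    try:
        r = requests.get(url, headers=HEADERS, timeout=TIMEOUT)
        if 200 <= r.status_code < 300:
            return r.text
    except requests.RequestException:
        pass
    return None

# From here down to the risk-band line the code is a sketch: the text patterns,
# the 180-day cutoff, and the weights are assumptions to tune, not calibrated values.
def content_signals(html):
    text = BeautifulSoup(html, "html.parser").get_text(" ", strip=True).lower()
    return {
        "aggressive_discount": bool(re.search(r"\b[5-9]\d\s*%\s*off\b", text)),
        "missing_policy": not any(k in text for k in ("privacy policy", "refund", "return policy", "terms")),
        "trust_badge_text": bool(re.search(r"secure checkout|100% secure|money[- ]back guarantee", text)),
    }

WEIGHTS = {"new_domain": 25, "whois_privacy": 10, "no_mx": 15, "no_https": 20,
           "aggressive_discount": 15, "missing_policy": 10, "trust_badge_text": 5}

def analyze(domain):
    ext = tldextract.extract(domain)  # accepts a bare domain or a full URL
    norm = ".".join(p for p in (ext.domain, ext.suffix) if p)
    try:
        w = whois.whois(norm)
    except Exception:  # WHOIS lookups fail often; continue without them
        w = {}
    age = domain_age_days(w)
    dns_records = resolve_dns(norm)
    html = fetch(f"https://{norm}")
    https_ok = html is not None
    html = html or fetch(f"http://{norm}")  # fall back to plain HTTP
    signals = {
        "new_domain": age is not None and age < 180,
        "whois_privacy": whois_privacy(w),
        "no_mx": not dns_records["MX"],
        "no_https": not https_ok,
    }
    if html:
        signals.update(content_signals(html))
    score = sum(WEIGHTS[name] for name, hit in signals.items() if hit)
    band = "High Risk" if score >= 50 else ("Moderate Risk" if score >= 30 else "Lower Risk")
    return {
        "domain": norm,
        "age_days": age,
        "dns": dns_records,
        "signals": signals,
        "risk_score": score,
        "risk_band": band,
    }

if __name__ == "__main__":
    import sys
    if len(sys.argv) < 2:
        print("Usage: python scan.py <domain>")
        raise SystemExit(1)
    print(json.dumps(analyze(sys.argv[1]), indent=2))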
Running the script
- Save the code above as scan.py.
- Open a terminal and install the dependencies (if you haven’t already):
  pip install python-whois requests beautifulsoup4 dnspython tldextract
- Test a domain:
  python scan.py example.com
The script outputs a JSON object containing:
- Domain age
- DNS status (especially MX/email)
- Detection of aggressive discounts, missing policies, unverified trust badges
- A risk_score and a risk band (High Risk, Moderate Risk, or Lower Risk)
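Raw JSON is fine for machines but awkward to skim. As a rough sketch, the helper below turns the result object into a few readable lines. It assumes `signals` is a dictionary mapping signal names to true/false values; the `summarize` name and the sample data are hypothetical and exist only for illustration.

```python
def summarize(result):
    """Turn the scanner's result dict into a few human-readable lines."""
    lines = [f"{result['domain']}: {result['risk_band']} (score {result['risk_score']})"]
    age = result.get("age_days")
    if age is None:
        lines.append("- domain age unknown (WHOIS gave no creation date)")
    else:
        lines.append(f"- domain registered {age} days ago")
    if not result.get("dns", {}).get("MX"):
        lines.append("- no MX records found")
    # Assumption: signals maps each signal name to a truthy/falsy value.
    for name, hit in result.get("signals", {}).items():
        if hit:
            lines.append(f"- signal triggered: {name}")
    return lines


if __name__ == "__main__":
    # Hypothetical result, shaped like the scanner's output.
    sample = {
        "domain": "shop.example",
        "age_days": 21,
        "dns": {"A": ["192.0.2.1"], "MX": []},
        "signals": {"new_domain": True, "whois_privacy": False},
        "risk_score": 40,
        "risk_band": "Moderate Risk",
    }
    print("\n".join(summarize(sample)))
```

Reading the triggered signals next to the score matters more than the number alone: a young domain with no MX records and a wall of discount banners is a very different case from an established site that merely uses WHOIS privacy.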
Real‑world example
A review of a suspicious fashion store demonstrated how an automatic “robot” check helped catch a likely scam:
CKlinen.com: A Scam Fashion Store Review
Conclusion
With a little Python and the right checks, you can screen a domain for the most common scam signals in seconds instead of checking each one by hand. Stay curious, share what works, and help keep friends and family safe when they shop online. If you have suggestions or want to contribute improvements to this script, feel free to comment or open a GitHub gist. Every bit helps in the fight against online fraud.