# Automate Website Security Audits with Technology Detection in Python
> **Source:** [Dev.to – Automate website security audits with technology detection in Python](https://dev.to/dapdev/automate-website-security-audits-with-technology-detection-in-python-2nbm)
## Technology Detection for Security Audits
Knowing what technologies a website runs is the first step in any security assessment.
Outdated CMS versions, exposed server headers, legacy JavaScript libraries — these are all attack vectors, and they’re all detectable.
In this tutorial we’ll build a Python tool that scans any website, identifies its technology stack, and flags potential security concerns based on what it finds.
## Why Technology Detection Matters for Security
Most security audits start with reconnaissance. Before testing for vulnerabilities, you need to know what’s running. A site using WordPress 5.x has a different risk profile than one running Next.js on Vercel.
Manually checking this is tedious. Browser extensions work for one site at a time but don’t scale. We’ll automate it with an API that detects 141+ technologies programmatically.
## Setup
We’ll use the Technology Detection API on RapidAPI.
- Subscribe to the API on RapidAPI to obtain your API key.
- Install the required Python package:
```bash
pip install requests
```

## Step 1 – Detect Technologies on a Target
The script below calls the Technology Detection API (via RapidAPI) and returns a structured inventory of everything running on the target site—CMS, server software, JavaScript libraries, analytics, CDN, and more.
```python
import requests
from typing import Any, Dict, List

# Replace with your own RapidAPI key
RAPIDAPI_KEY = "YOUR_RAPIDAPI_KEY"


def detect_technologies(url: str) -> Dict[str, Any]:
    """
    Query the Technology Detection API for the given URL.

    Parameters
    ----------
    url : str
        The target website (e.g., "https://example.com").

    Returns
    -------
    dict
        The JSON response from the API.
    """
    response = requests.get(
        "https://technology-detection-api.p.rapidapi.com/detect",
        params={"url": url},
        headers={
            "x-rapidapi-host": "technology-detection-api.p.rapidapi.com",
            "x-rapidapi-key": RAPIDAPI_KEY,
        },
        timeout=30,  # avoid hanging indefinitely on a stalled connection
    )
    response.raise_for_status()
    return response.json()


# Example usage
if __name__ == "__main__":
    result = detect_technologies("https://example.com")
    technologies: List[Dict[str, Any]] = result.get("technologies", [])
    print(f"Detected {len(technologies)} technologies")
    for tech in technologies:
        # The name may arrive under either key, so try both
        name = tech.get("name") or tech.get("technology", "Unknown")
        category = tech.get("category", "Unknown")
        version = tech.get("version", "")
        version_str = f" v{version}" if version else ""
        print(f"  [{category}] {name}{version_str}")
```

### What the script does
- Sends a GET request to the RapidAPI endpoint with the target URL.
- Raises an exception if the request fails (`response.raise_for_status()`).
- Returns the parsed JSON payload.
- In the example block, it extracts the list of detected technologies and prints each one in a readable format: `[Category] TechnologyName vVersion`

Replace `YOUR_RAPIDAPI_KEY` with your actual RapidAPI key before running the script.
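
Hardcoding the key is fine for a quick test, but it is easy to commit by accident. Here is a minimal sketch of one alternative, assuming you export the key in an environment variable called `RAPIDAPI_KEY`. That variable name and the helpers `load_api_key` and `build_headers` are illustrative choices, not part of the API.

```python
import os
from typing import Dict, Mapping, Optional

API_HOST = "technology-detection-api.p.rapidapi.com"


def load_api_key(env: Optional[Mapping[str, str]] = None) -> str:
    """Read the RapidAPI key from the environment instead of the source file.

    RAPIDAPI_KEY is an assumed variable name; use whatever your setup defines.
    """
    env = os.environ if env is None else env
    key = env.get("RAPIDAPI_KEY", "").strip()
    if not key:
        raise RuntimeError("Set the RAPIDAPI_KEY environment variable first.")
    return key


def build_headers(api_key: str) -> Dict[str, str]:
    """Build the two RapidAPI headers that detect_technologies() sends."""
    return {"x-rapidapi-host": API_HOST, "x-rapidapi-key": api_key}
```

With this in place, `detect_technologies` could pass `headers=build_headers(load_api_key())` instead of reading the module-level constant.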
## Step 2 – Define Security Rules
Now we’ll build a tiny rules engine that flags technologies with known security implications.
```python
# Security-audit rules
# Each rule contains:
#   - a `match` callable that receives a technology dict and returns True/False
#   - a `severity` level (high, medium, low, info)
#   - a human-readable `message` that can be formatted with `name` and `version`
SECURITY_RULES = [
    {
        # `or ""` guards against a missing or null field in the API response
        "match": lambda t: (t.get("name") or "").lower() == "wordpress",
        "severity": "medium",
        "message": (
            "WordPress detected: verify the version is current and plugins are "
            "updated. WordPress sites are the #1 target for automated attacks."
        ),
    },
    {
        "match": lambda t: (
            (t.get("name") or "").lower() == "jquery"
            and (t.get("version") or "").startswith(("1.", "2."))
        ),
        "severity": "high",
        "message": "Outdated jQuery version detected ({version}).",
    },
    # Add additional rules here as needed
]
```
## Step 3 – Run the Audit

The runner applies every rule to every detected technology, then prints the findings sorted by severity.

```python
def run_security_audit(url: str) -> list[dict]:  # list[dict] needs Python 3.9+
    """
    Detect technologies for *url*, evaluate them against ``SECURITY_RULES``,
    and return a list of findings.

    Each finding is a ``dict`` with the keys:
      - severity
      - technology
      - version
      - message
    """
    separator = "=" * 60
    print(f"\n{separator}")
    print(f" SECURITY AUDIT: {url}")
    print(f"{separator}\n")

    # detect_technologies() is defined in Step 1
    data = detect_technologies(url)
    technologies = data.get("technologies", [])
    if not technologies:
        print(" No technologies detected.")
        return []
    print(f" Detected {len(technologies)} technologies\n")

    findings: list[dict] = []
    for tech in technologies:
        for rule in SECURITY_RULES:
            if rule["match"](tech):
                name = tech.get("name") or tech.get("technology", "Unknown")
                # Fall back to "N/A" for a missing, null, or empty version
                version = tech.get("version") or "N/A"
                message = rule["message"].format(name=name, version=version)
                findings.append(
                    {
                        "severity": rule["severity"],
                        "technology": name,
                        "version": version,
                        "message": message,
                    }
                )

    # Sort by severity (high, medium, low, info); unknown levels go last
    severity_order = {"high": 0, "medium": 1, "low": 2, "info": 3}
    findings.sort(key=lambda f: severity_order.get(f["severity"], 99))

    # Print the results
    for f in findings:
        sev = f["severity"].upper()
        print(f"[{sev}] {f['technology']} ({f['version']}): {f['message']}")
    return findings


# Example execution
if __name__ == "__main__":
    target_url = "https://example.com"
    run_security_audit(target_url)
```

The script prints a concise report, ordered by severity. With the two example rules it flags WordPress installs and jQuery 1.x/2.x. Extend `SECURITY_RULES` with patterns relevant to your environment, such as exposed server versions or third-party trackers.
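
As one illustration of extending the rule set, the sketch below adds two rules in the same shape as `SECURITY_RULES`: one for a web server that reveals its version, and one for analytics scripts. The category labels `"web servers"` and `"analytics"`, the message wording, and the helpers `field` and `evaluate` are assumptions made for this example. Check them against the categories the API actually returns.

```python
from typing import Any, Dict, List


def field(tech: Dict[str, Any], key: str) -> str:
    """Return tech[key] as a lowercase string, treating missing/None as ''."""
    return str(tech.get(key) or "").lower()


# Hypothetical extra rules; the category labels are assumptions about the
# API response and may need adjusting.
EXTRA_RULES: List[Dict[str, Any]] = [
    {
        "match": lambda t: (
            field(t, "category") == "web servers" and field(t, "version") != ""
        ),
        "severity": "low",
        "message": "{name} exposes its version ({version}); consider hiding it.",
    },
    {
        "match": lambda t: field(t, "category") == "analytics",
        "severity": "info",
        "message": "Third-party tracker {name} detected; confirm it is still needed.",
    },
]


def evaluate(tech: Dict[str, Any], rules: List[Dict[str, Any]]) -> List[str]:
    """Return the formatted message of every rule that matches tech."""
    name = tech.get("name") or "Unknown"
    version = tech.get("version") or "N/A"
    return [
        rule["message"].format(name=name, version=version)
        for rule in rules
        if rule["match"](tech)
    ]
```

Calling `SECURITY_RULES.extend(EXTRA_RULES)` before `run_security_audit` would include these rules in the report. `evaluate` is only there to try a rule against a single technology dict.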
## What’s Next?
- Integrate the audit into CI/CD pipelines to catch regressions early.
- Expand the rule set with CVE look‑ups or OWASP Top 10 mappings.
- Store findings in a ticketing system (Jira, GitHub Issues, …) for remediation tracking.
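
For the CI/CD idea, one possible approach is to turn the findings list into a process exit code. This sketch assumes findings have the shape returned by `run_security_audit`. The function name `exit_code_for` and its `fail_on` threshold are hypothetical.

```python
from typing import Dict, List

SEVERITY_RANK = {"info": 0, "low": 1, "medium": 2, "high": 3}


def exit_code_for(findings: List[Dict[str, str]], fail_on: str = "high") -> int:
    """Return 1 if any finding is at or above the fail_on severity, else 0."""
    threshold = SEVERITY_RANK[fail_on]
    for finding in findings:
        # Unknown severities rank below "info", so they never fail the build.
        rank = SEVERITY_RANK.get(finding.get("severity", ""), -1)
        if rank >= threshold:
            return 1
    return 0
```

A pipeline step could then end with `sys.exit(exit_code_for(run_security_audit(url)))`, which most CI systems treat as a failed job when the code is non-zero.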
With a repeatable, automated detection step you’ll spend far less time on manual reconnaissance and more time on the real security work—remediating the issues that matter. Happy hunting!
## Step 4 – Batch Audit Multiple Sites
If you need to audit several websites (or a client’s portfolio), you can run the scanner in a loop and collect the results for each domain.
```python
import time


def batch_audit(urls):
    """Run a security audit for each URL in *urls* and return a dict of findings."""
    all_findings = {}
    for url in urls:
        try:
            findings = run_security_audit(url)
            all_findings[url] = findings
        except Exception as e:
            # Record the failure and keep going with the remaining sites
            print(f"\n Error scanning {url}: {e}")
            all_findings[url] = []
        time.sleep(1)  # pause between requests to be polite to the API

    # Summary
    print("\n" + "=" * 60)
    print(" BATCH AUDIT SUMMARY")
    print("=" * 60)
    for url, findings in all_findings.items():
        high = sum(1 for f in findings if f["severity"] == "high")
        medium = sum(1 for f in findings if f["severity"] == "medium")
        domain = url.replace("https://", "").rstrip("/")
        print(f" {domain}: {high} high, {medium} medium, {len(findings)} total")
    return all_findings


# Example usage
sites = [
    "https://example.com",
    "https://yoursite.com",
    "https://clientsite.com",
]
batch_audit(sites)
```

## Step 5 – Export as JSON Report
For documentation or integration with other security tools you can dump the findings to a JSON file.
```python
import json
from datetime import datetime, timezone


def export_report(url, findings, filename=None):
    """Export scan results to a JSON file.

    Args:
        url (str): The target URL that was scanned.
        findings (list): List of finding dictionaries.
        filename (str, optional): Desired output filename. If omitted,
            a name is generated from the target domain and the current date.

    Returns:
        str: The path to the generated JSON report.
    """
    # One timezone-aware UTC timestamp for both the report and the filename
    # (datetime.utcnow() is deprecated since Python 3.12)
    scan_time = datetime.now(timezone.utc)
    report = {
        "target": url,
        "scan_date": scan_time.isoformat(),
        "total_findings": len(findings),
        "severity_counts": {
            "high": sum(1 for f in findings if f["severity"] == "high"),
            "medium": sum(1 for f in findings if f["severity"] == "medium"),
            "low": sum(1 for f in findings if f["severity"] == "low"),
            "info": sum(1 for f in findings if f["severity"] == "info"),
        },
        "findings": findings,
    }
    if filename is None:
        # Strip the scheme and replace slashes so the name is filesystem-safe
        domain = url.replace("https://", "").replace("http://", "")
        domain = domain.replace("/", "_").rstrip("_")
        filename = f"audit_{domain}_{scan_time.strftime('%Y%m%d')}.json"
    with open(filename, "w", encoding="utf-8") as f:
        json.dump(report, f, indent=2)
    print(f"\nReport exported to {filename}")
    return filename


# Example usage: run an audit first, then export its findings
findings = run_security_audit("https://example.com")
export_report("https://example.com", findings)
```

## Extending the Audit
The foundation can be expanded in several practical ways:
- Version database – Keep a mapping of technologies to their latest versions and automatically flag anything outdated.
- CVE lookup – Cross‑reference detected technology versions against the NIST NVD API for known vulnerabilities.
- Scheduled scans – Run audits on a cron schedule and alert when a site’s technology stack changes unexpectedly.
- CI/CD integration – Add a post‑deploy step that scans your site and fails the pipeline if high‑severity findings appear.
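
The version-database idea can be sketched with a small lookup table and a numeric version comparison. The entries in `LATEST_VERSIONS` are illustrative placeholders rather than authoritative release data, and `parse_version` and `is_outdated` are hypothetical helper names. A production tool would likely pull versions from a maintained source and use a dedicated version parser.

```python
import re
from typing import Dict, Optional, Tuple

# Placeholder values for illustration only; verify against each project's
# release page and keep the table updated.
LATEST_VERSIONS: Dict[str, str] = {
    "jquery": "3.7.1",
    "wordpress": "6.5",
}


def parse_version(version: str) -> Tuple[int, ...]:
    """Turn '3.6.0' into (3, 6, 0); a suffix such as '-beta' is dropped."""
    parts = []
    for piece in version.split("."):
        match = re.match(r"\d+", piece)
        if match is None:
            break
        parts.append(int(match.group()))
    return tuple(parts)


def _pad(version: Tuple[int, ...], width: int) -> Tuple[int, ...]:
    """Right-pad with zeros so (6, 5) and (6, 5, 0) compare as equal."""
    return version + (0,) * (width - len(version))


def is_outdated(name: str, version: Optional[str]) -> Optional[bool]:
    """Return True/False when a comparison is possible, None when it is not."""
    latest = LATEST_VERSIONS.get(name.lower())
    if latest is None or not version:
        return None
    detected = parse_version(version)
    if not detected:
        return None
    current = parse_version(latest)
    width = max(len(detected), len(current))
    return _pad(detected, width) < _pad(current, width)
```

Returning `None` for unknown technologies or unparseable versions keeps "cannot tell" separate from "up to date", so a rule built on `is_outdated` can choose to skip those cases or report them at `info` severity.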
## Wrapping Up
Technology detection is a practical starting point for security reconnaissance. By combining the Technology Detection API with a simple rules engine, you get an automated audit tool that flags real concerns in seconds.
This doesn’t replace a full penetration test, but it’s a fast way to surface low‑hanging fruit—outdated libraries, exposed server versions, and forgotten third‑party scripts.
- Subscribe on RapidAPI and start scanning.
What security checks would you add to the rules engine? Share your ideas in the comments.