Best Beginner’s Guide For Cybersecurity Recon with Python
Source: Dev.to
Cybersecurity reconnaissance is the first and most critical step in understanding a target’s digital footprint. As a beginner, knowing where to look, how to look, and what tools to use can dramatically increase your effectiveness in network security, penetration testing, and OSINT investigations.
Python has become the go‑to language for security specialists due to its simplicity, versatility, and powerful libraries. This guide breaks down the foundations of recon, DNS enumeration, and network scanning using Python, so beginners can grasp concepts practically and immediately start experimenting.
By the end of this guide you will be able to:
- Understand the difference between passive and active reconnaissance.
- Define the scope and target effectively for any recon project.
- Use Python to query DNS records, subdomains, and certificate‑transparency logs.
- Perform asynchronous network scans for host and service discovery.
- Organise recon data efficiently for analysis and reporting.
- Apply OPSEC and rate‑limiting to stay stealthy during recon.
These skills form a solid foundation for penetration testing, bug bounty hunting, and OSINT investigations.
Prerequisites
Before starting, ensure you have a Python 3.12 environment ready. The following libraries are recommended for recon tasks:
| Library | Purpose |
|---|---|
| asyncio / trio | Handle thousands of tasks concurrently without threads |
| httpx | Async HTTP/HTTPS requests with proxy support; HTTP/2 and SOCKS via the http2 and socks extras |
| aiodns | Asynchronous DNS resolution |
| ipwhois | ASN and prefix lookups |
| rich | Pretty terminal output with progress bars |
| pandas | Data organization, CSV/HTML export |
Setup
python3 -m venv recon
source recon/bin/activate
pip install "httpx[http2]" aiodns ipwhois rich pandas requests
Recon Fundamentals: Active vs Passive
Passive Recon
- You do not touch the target.
- Sources include WHOIS, crt.sh, Shodan, GitHub, and leaked databases.
- Stealthy; leaves no logs on the target.
Active Recon
- Direct probing via DNS queries, port scanning, banner grabbing, and web crawling.
- Powerful but generates logs and may trigger firewalls.
Rule: Always start with passive recon. It’s safer, usually free, and helps narrow down what to probe actively.
Canonical Workflow
- Scope Definition – Define IP ranges, domains, and employee aliases.
- Passive Recon – Gather public artefacts.
- Correlation & Pivot – Deduplicate, enrich, generate leads.
- Active Recon – Probe live hosts, services, and versions.
- Reporting – Export structured JSON or CSV for analysis.
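The scope-definition step can be enforced in code so that later stages never probe something outside the agreed boundaries. The following is a minimal sketch, assuming a hypothetical scope of one domain and one CIDR range; `SCOPE_DOMAINS`, `SCOPE_NETWORKS`, and `in_scope` are illustrative names, not part of any library.

```python
import ipaddress

# Hypothetical scope for illustration; replace with the domains and
# ranges you are actually authorized to test.
SCOPE_DOMAINS = {"example.com"}
SCOPE_NETWORKS = [ipaddress.ip_network("203.0.113.0/24")]


def in_scope(target: str) -> bool:
    """Return True if target is an in-scope IP address or hostname."""
    try:
        addr = ipaddress.ip_address(target)
    except ValueError:
        # Not an IP address, so treat it as a hostname.
        host = target.lower().rstrip(".")
        return any(host == d or host.endswith("." + d) for d in SCOPE_DOMAINS)
    return any(addr in net for net in SCOPE_NETWORKS)
```

Matching on "." + domain rather than a bare suffix keeps look-alike names such as notexample.com out of scope.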
DNS Records Overview
DNS maps human-readable names to the addresses and services behind them. Understanding record types gives early insights into an organization’s online structure.
A / AAAA – Host to IP Mapping
- A – IPv4 address.
- AAAA – IPv6 address.
These records reveal where a service is hosted (cloud provider, on‑prem, shared hosting).
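As a rough illustration of A/AAAA lookups, the standard library can resolve a hostname and group the results by address family. This is a sketch under assumptions: `resolve_addresses` and `split_by_family` are hypothetical helper names, and `socket.getaddrinfo` goes through the system resolver, which may also consult local hosts files and caches rather than querying DNS directly.

```python
import ipaddress
import socket


def resolve_addresses(hostname: str) -> set[str]:
    """Resolve a hostname to its addresses via the system resolver."""
    infos = socket.getaddrinfo(hostname, None, proto=socket.IPPROTO_TCP)
    # Each entry is (family, type, proto, canonname, sockaddr);
    # sockaddr[0] is the address string for both IPv4 and IPv6.
    return {info[4][0] for info in infos}


def split_by_family(addresses: set[str]) -> dict[str, list[str]]:
    """Group addresses under the record type that would hold them."""
    grouped = {"A": [], "AAAA": []}
    for addr in sorted(addresses):
        version = ipaddress.ip_address(addr).version
        grouped["A" if version == 4 else "AAAA"].append(addr)
    return grouped
```

Calling `split_by_family(resolve_addresses("example.com"))` performs a live lookup, so it counts as touching DNS infrastructure; run it only against names that are in scope.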
CNAME – Aliases and CDNs
A CNAME points one domain to another domain, often exposing third‑party services:
blog.example.com → cname → example-blog.hosting.net
Typical services revealed:
- CDN providers (Cloudflare, Akamai)
- Email platforms
- SaaS dashboards
- Cloud hosting environments (AWS, GCP, Azure)
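Once CNAME targets are collected, a simple suffix match can hint at the third-party service behind each alias. The sketch below assumes a small, hand-maintained table; `CNAME_SUFFIXES` and `guess_provider` are illustrative names, and the table is deliberately incomplete.

```python
from typing import Optional

# Illustrative, incomplete suffix table; extend it as you meet new providers.
CNAME_SUFFIXES = {
    "cloudfront.net": "Amazon CloudFront",
    "cdn.cloudflare.net": "Cloudflare",
    "edgekey.net": "Akamai",
    "azurewebsites.net": "Azure App Service",
    "github.io": "GitHub Pages",
    "herokuapp.com": "Heroku",
}


def guess_provider(cname_target: str) -> Optional[str]:
    """Return a provider name if the CNAME target matches a known suffix."""
    host = cname_target.lower().rstrip(".")
    for suffix, provider in CNAME_SUFFIXES.items():
        if host == suffix or host.endswith("." + suffix):
            return provider
    return None
```

A result of `None` only means the target is not in the table, not that the host is self-managed.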
NS – Authoritative Servers
NS records indicate which servers are authoritative for a domain, helping you infer:
- Hosting provider
- Whether DNS is self‑managed or outsourced
- Redundancy/failover configuration
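- Possible subdomains, if a name server is misconfigured to allow zone transfers (AXFR) to anyone
Note: Name servers hosted under the target’s own domain can suggest that the organization runs its own DNS infrastructure, which is more common in larger environments.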
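A quick way to separate self-managed from outsourced DNS is to check whether the NS hostnames sit under the target domain. This is a heuristic sketch with a hypothetical helper name, `classify_nameservers`; an in-domain name can still be a vanity alias for an outsourced provider, so treat the result as a lead rather than a conclusion.

```python
def classify_nameservers(domain: str, ns_hosts: list[str]) -> dict[str, list[str]]:
    """Split NS hostnames into in-domain ("self") and external ones."""
    domain = domain.lower().rstrip(".")
    result = {"self": [], "external": []}
    for ns in ns_hosts:
        host = ns.lower().rstrip(".")
        key = "self" if host.endswith("." + domain) else "external"
        result[key].append(host)
    return result
```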
MX – Email Routing
MX records show the mail servers responsible for receiving email:
- Reveal use of Google Workspace, Microsoft 365, or custom mail servers.
- Highlight legacy or insecure mail systems.
- Expose additional subdomains tied to mail infrastructure.
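MX answers come as (preference, host) pairs, where the lowest preference value is tried first. The sketch below orders the records and labels a few well-known mail platforms by hostname suffix; `MX_HINTS` and `summarize_mx` are illustrative names, and the hint table is an assumption that covers only two providers.

```python
# Illustrative hints only; many other mail providers exist.
MX_HINTS = {
    "google.com": "Google Workspace",
    "googlemail.com": "Google Workspace",
    "protection.outlook.com": "Microsoft 365",
}


def summarize_mx(records: list[tuple[int, str]]) -> list[tuple[int, str, str]]:
    """Sort (preference, host) pairs and label well-known providers."""
    summary = []
    for preference, host in sorted(records):
        host = host.lower().rstrip(".")
        provider = next(
            (name for suffix, name in MX_HINTS.items() if host.endswith("." + suffix)),
            "custom or unknown",
        )
        summary.append((preference, host, provider))
    return summary
```

Hosts labeled "custom or unknown" are worth a second look, since they may be self-run or legacy mail servers.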
TXT – Security Policies & Verification Artefacts
Common uses:
- SPF, DMARC, DKIM (email authentication)
- Cloud/SaaS verification tokens
- Public security disclosures
- Domain metadata
Note: SPF, DKIM, and DMARC together help prevent spoofed emails.
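SPF records are plain text, so the third-party senders and IP ranges they authorize can be pulled out with basic string handling. The following is a simplified sketch, not a full SPF parser: `parse_spf` is a hypothetical helper that only extracts the include, ip4, and ip6 mechanisms and ignores macros, redirects, and other mechanisms.

```python
def parse_spf(txt_value: str) -> dict[str, list[str]]:
    """Extract include, ip4, and ip6 mechanisms from an SPF TXT record."""
    parsed = {"include": [], "ip4": [], "ip6": []}
    tokens = txt_value.split()
    if not tokens or tokens[0].lower() != "v=spf1":
        return parsed  # Not an SPF record.
    for token in tokens[1:]:
        # Drop an optional qualifier (+, -, ~, ?) before the mechanism name.
        mechanism = token.lstrip("+-~?")
        name, sep, value = mechanism.partition(":")
        if sep and name.lower() in parsed:
            parsed[name.lower()].append(value)
    return parsed
```

Each include value names another domain whose SPF policy is trusted, which often reveals the email and marketing platforms an organization uses.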
SRV – Service Discovery
SRV records specify hostname and port for services such as SIP, LDAP, Kerberos, VoIP, Microsoft services, and game servers. They can uncover:
- Internal authentication services
- Directory services
- Infrastructure dependencies not visible on the public web
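SRV names follow the pattern _service._proto.domain, so a list of candidate names can be generated before any query is sent. This sketch only builds the names; `COMMON_SRV` is an illustrative, non-exhaustive list and `srv_candidates` is a hypothetical helper. Resolving the candidates is left to whichever DNS library you use.

```python
# Common service/protocol pairs; an illustrative list, not exhaustive.
COMMON_SRV = [
    ("ldap", "tcp"),
    ("kerberos", "tcp"),
    ("kerberos", "udp"),
    ("sip", "tcp"),
    ("sip", "udp"),
    ("sips", "tcp"),
    ("xmpp-server", "tcp"),
]


def srv_candidates(domain: str) -> list[str]:
    """Build SRV query names of the form _service._proto.domain."""
    domain = domain.lower().rstrip(".")
    return [f"_{service}._{proto}.{domain}" for service, proto in COMMON_SRV]
```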
Subdomain Discovery – Expanding the Attack Surface
Subdomains often host unique applications, APIs, admin panels, or onboarding systems (e.g., api.example.com, vpn.example.com, dev.example.com). Each one you discover expands the attack surface you know about.
Passive Enumeration
Collect information from external sources that already monitor the Internet:
- Certificate Transparency (CT) logs
- Historical DNS data
- Search‑engine dorks
Each source reveals different layers of a domain’s evolution.
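Because each source sees a different slice of the domain, it helps to merge their results while remembering where each hostname came from. The sketch below assumes results are already collected as lists of hostnames per source; `merge_sources` and the source labels in the usage note are hypothetical.

```python
from collections import defaultdict


def merge_sources(findings: dict[str, list[str]]) -> dict[str, set[str]]:
    """Map each unique hostname to the set of sources that reported it."""
    merged = defaultdict(set)
    for source, hostnames in findings.items():
        for name in hostnames:
            # Normalize case and trailing dots so duplicates collapse.
            merged[name.lower().rstrip(".")].add(source)
    return dict(merged)
```

For example, `merge_sources({"ct": ["dev.example.com"], "historical_dns": ["DEV.example.com."]})` yields a single entry for dev.example.com attributed to both sources. Hostnames confirmed by several sources are usually the stronger leads.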
Certificate Transparency Logs
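Major browsers require publicly trusted SSL/TLS certificates to be recorded in public CT logs, so certificate authorities log nearly every certificate they issue. This includes certificates for subdomains that may have been intended to stay private. Certificates from private CAs and self-signed certificates do not appear there.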
- crt.sh – Public CT log search engine
- bufferover.run – Offers CT, DNS, and reverse‑lookup datasets
Example subdomains found via CT logs
api.example.com
dev.example.com
staging-api.example.com
internal-vpn.example.com
Beginner‑friendly Python script to fetch CT logs
import requests

domain = "example.com"
url = f"https://crt.sh/?q=%25.{domain}&output=json"  # %25 is a URL-encoded "%" wildcard

response = requests.get(url, timeout=10)
if response.status_code == 200:
    entries = response.json()
    # name_value may hold several newline-separated names per certificate
    subdomains = {
        name.strip().lower()
        for entry in entries
        for name in entry["name_value"].splitlines()
    }
    for sub in sorted(subdomains):
        print(sub)
else:
    print("Failed to fetch CT logs")
The script queries crt.sh for certificates issued for names under the target domain, extracts the unique names, and prints them. You can redirect the output to a file for later analysis.
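Raw CT results usually need cleaning before they are useful: a single name_value field can hold several newline-separated names, wildcard entries such as *.example.com appear often, and some names may belong to other domains. A minimal sketch of that cleanup, assuming entries shaped like the crt.sh JSON used above; `clean_ct_names` is a hypothetical helper name.

```python
def clean_ct_names(entries: list[dict], domain: str) -> list[str]:
    """Return sorted, unique, in-scope hostnames from crt.sh JSON entries."""
    domain = domain.lower()
    names = set()
    for entry in entries:
        # name_value can contain several newline-separated names.
        for raw in entry.get("name_value", "").splitlines():
            name = raw.strip().lower()
            if name.startswith("*."):
                name = name[2:]  # Drop the wildcard label.
            if name == domain or name.endswith("." + domain):
                names.add(name)
    return sorted(names)
```

The function takes already-parsed entries rather than fetching them, so it can be tested offline and reused with saved CT exports.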