I Built a Security Scanner That Audits PDFs Before You Send Them. Here's How. [Devlog #10]
Source: Dev.to
Overview
You’re about to email a contract. It looks clean, but it may still contain hidden metadata such as your name, your machine’s hostname, the original author’s company, and a creation timestamp. The recipient can see all of that.
Audit Report scans every PDF before you hit send and catches these leaks.
What the Scanner Checks
- Metadata leaks – Author, Creator, Producer, timestamps
- Hidden text layers – Content invisible at normal zoom
- Embedded scripts or form actions
- Non‑standard objects that shouldn’t be in a clean document
- Redacted content that wasn’t properly removed (pixel‑level check)
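As a rough illustration of the "embedded scripts or form actions" check, the sketch below scans a PDF's raw bytes for name tokens that the PDF specification associates with scripts and automatic actions. The function name `find_suspicious_names` and the token list are assumptions made for this example, not the tool's actual implementation, and a plain byte scan like this will miss names stored inside compressed object streams.

```rust
/// PDF name tokens associated with scripts or automatic actions.
/// The list is illustrative, not exhaustive.
const SUSPICIOUS_NAMES: [&str; 6] = [
    "/JavaScript", "/JS", "/OpenAction", "/AA", "/Launch", "/SubmitForm",
];

/// Returns true if `b` can continue a PDF name, i.e. it is neither
/// whitespace nor one of the PDF delimiter characters.
fn is_name_char(b: u8) -> bool {
    !b.is_ascii_whitespace() && b != 0 && !b"()<>[]{}/%".contains(&b)
}

/// Hypothetical helper: return each suspicious name that occurs as a
/// complete token somewhere in the raw bytes.
pub fn find_suspicious_names(raw: &[u8]) -> Vec<&'static str> {
    SUSPICIOUS_NAMES
        .iter()
        .copied()
        .filter(|name| {
            let pat = name.as_bytes();
            raw.windows(pat.len()).enumerate().any(|(i, w)| {
                // Require a token boundary after the match so that
                // "/JS" does not match a longer name such as "/JSON".
                w == pat && raw.get(i + pat.len()).map_or(true, |&b| !is_name_char(b))
            })
        })
        .collect()
}
```

Any hit would be a reason to inspect the file more closely, not proof that it is malicious: `/OpenAction`, for example, is also used for harmless things like setting the initial page view.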
Example Result Structure (Rust)
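#[derive(Default)] // needed for AuditResult::default() in audit_pdf
pub struct AuditResult {
    pub metadata_warnings: Vec<String>,
    pub hidden_content: Vec<HiddenLayer>,
    // Element type was lost in the source; ObjectId is an assumption.
    pub suspicious_objects: Vec<ObjectId>,
    pub risk_level: RiskLevel,
}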
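The post does not show `RiskLevel` or the `compute_risk` function that the auditing code calls. Here is a minimal sketch of how they could fit together, assuming three severity levels and a scoring rule chosen purely for illustration. The struct below is a simplified stand-in that uses plain `u32` object numbers so the example stands alone.

```rust
/// Assumed severity scale; the original post does not list the variants.
#[derive(Debug, Default, Clone, Copy, PartialEq, Eq, PartialOrd, Ord)]
pub enum RiskLevel {
    #[default]
    Low,
    Medium,
    High,
}

/// Simplified stand-in for the post's `AuditResult`.
#[derive(Debug, Default)]
pub struct AuditResult {
    pub metadata_warnings: Vec<String>,
    pub hidden_content: Vec<u32>,     // object numbers, for illustration
    pub suspicious_objects: Vec<u32>, // object numbers, for illustration
    pub risk_level: RiskLevel,
}

/// Hypothetical scoring rule: hidden content or suspicious objects count
/// as High, metadata alone as Medium, and a clean result as Low.
pub fn compute_risk(result: &AuditResult) -> RiskLevel {
    if !result.hidden_content.is_empty() || !result.suspicious_objects.is_empty() {
        RiskLevel::High
    } else if !result.metadata_warnings.is_empty() {
        RiskLevel::Medium
    } else {
        RiskLevel::Low
    }
}
```

Deriving `Ord` on the enum makes comparisons such as `level >= RiskLevel::Medium` possible, which is convenient if a caller wants to block sending above a threshold.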
Core Auditing Function
pub fn audit_pdf(doc: &Document) -> AuditResult {
    let mut result = AuditResult::default();

    // 1. Check Info dictionary
    if let Ok(info) = doc.get_info() {
        for key in &["Author", "Creator", "Producer"] {
            if info.get(*key).is_some() {
                result.metadata_warnings.push(format!("{} field present", key));
            }
        }
    }

    // 2. Walk all objects for suspicious content
    for (id, object) in &doc.objects {
        if let Ok(stream) = object.as_stream() {
            if contains_hidden_text(stream) {
                result.hidden_content.push(HiddenLayer { id: *id });
            }
        }
    }

    result.risk_level = compute_risk(&result);
    result
}
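`contains_hidden_text` is not shown in the post. One well-known way to hide text in a PDF is text rendering mode 3 (the `3 Tr` operator), which draws text with neither fill nor stroke. The sketch below looks for that operator in an already-decompressed content stream. It takes a byte slice instead of the library's stream type, and both that signature and the choice of heuristic are assumptions. It does not cover other hiding tricks such as white-on-white text or tiny font sizes, and it can give a false positive if the bytes `3 Tr` appear inside a string literal.

```rust
/// Hypothetical heuristic: report whether a decoded content stream sets
/// text rendering mode 3 (invisible text) via the `Tr` operator.
pub fn contains_hidden_text(content: &[u8]) -> bool {
    // Content streams are sequences of tokens in postfix order:
    // operands come first, then the operator.
    let tokens: Vec<&[u8]> = content
        .split(|b| b.is_ascii_whitespace())
        .filter(|t| !t.is_empty())
        .collect();

    // Look for the operand "3" immediately followed by the operator "Tr".
    tokens.windows(2).any(|pair| pair[0] == b"3" && pair[1] == b"Tr")
}
```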
The audit result is written as a formatted PDF report, generated fully offline with no external libraries. The report includes:
- A green summary indicating overall safety
- A detailed list of checked items and any warnings
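The post does not show how the report is written. To give a sense of what producing a PDF without external libraries can involve, here is a minimal sketch that builds a one-page PDF with a single line of text using only the Rust standard library. The function name `minimal_pdf`, the page size, and the font choice are all assumptions for illustration. The real report is presumably much richer.

```rust
/// Build a one-page PDF containing a single line of text, using only std.
/// Hypothetical helper, not the tool's actual report writer.
pub fn minimal_pdf(line: &str) -> Vec<u8> {
    // Escape the characters that are special inside a PDF literal string.
    let escaped = line
        .replace('\\', "\\\\")
        .replace('(', "\\(")
        .replace(')', "\\)");
    let content = format!("BT /F1 14 Tf 72 720 Td ({}) Tj ET", escaped);

    // Objects 1..=5: catalog, page tree, page, content stream, font.
    let objects = vec![
        "<< /Type /Catalog /Pages 2 0 R >>".to_string(),
        "<< /Type /Pages /Kids [3 0 R] /Count 1 >>".to_string(),
        "<< /Type /Page /Parent 2 0 R /MediaBox [0 0 612 792] /Contents 4 0 R \
         /Resources << /Font << /F1 5 0 R >> >> >>"
            .to_string(),
        format!("<< /Length {} >>\nstream\n{}\nendstream", content.len(), content),
        "<< /Type /Font /Subtype /Type1 /BaseFont /Helvetica >>".to_string(),
    ];

    let mut out = b"%PDF-1.4\n".to_vec();
    let mut offsets = Vec::new();
    for (i, body) in objects.iter().enumerate() {
        // Record the byte offset of each object for the xref table.
        offsets.push(out.len());
        out.extend_from_slice(format!("{} 0 obj\n{}\nendobj\n", i + 1, body).as_bytes());
    }

    // Cross-reference table: each entry must be exactly 20 bytes long.
    let xref_pos = out.len();
    out.extend_from_slice(format!("xref\n0 {}\n", objects.len() + 1).as_bytes());
    out.extend_from_slice(b"0000000000 65535 f \n");
    for off in &offsets {
        out.extend_from_slice(format!("{:010} 00000 n \n", off).as_bytes());
    }
    out.extend_from_slice(
        format!(
            "trailer\n<< /Size {} /Root 1 0 R >>\nstartxref\n{}\n%%EOF\n",
            objects.len() + 1,
            xref_pos
        )
        .as_bytes(),
    );
    out
}
```

Writing the returned bytes to disk with `std::fs::write("report.pdf", minimal_pdf("Risk level: Low"))` should give a file that standard viewers can open. Helvetica is one of the standard 14 PDF fonts, so no font data needs to be embedded.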
Why It Matters
Most people check only the visible content of a PDF and never look at what is hidden inside the file. This tool does.
Get the Tool
Hiyoko PDF Vault – a ready‑to‑use version of the scanner:
https://hiyokoko.gumroad.com/l/HiyokoPDFVault
Follow the author on Twitter: @hiyoyok