Path Traversal: What It Is, Why It's Dangerous, and How to Stop Attackers from Reading Files They Shouldn't
Source: Dev.to
Path traversal is a web vulnerability in which an attacker reads files outside the directory your application intends to serve. It sounds simple, but the impact can be severe: one missing validation and an attacker can walk straight into your .env, your database backups, or your cloud credentials using nothing more than ../ in a URL.

The mock server used in this demonstration is available at https://github.com/jercatallo/lab-path-traversal if you want to follow along and try the vulnerability yourself.

Path traversal, also called directory traversal, happens when an application takes a filename from user input and builds a file path with it, without checking where that path actually leads. Attackers chain ../ sequences to climb up the directory tree and reach files the application was never meant to expose.

Common targets include:

- Environment files with database credentials
- Configuration files with cloud API keys
- Database backups with user data and password hashes
- Internal service credentials

Path traversal, Local File Inclusion (LFI), and Remote File Inclusion (RFI) are related but different vulnerabilities. Here is a quick comparison:
| Feature | Path Traversal | LFI | RFI |
| --- | --- | --- | --- |
| File source | Local filesystem | Local filesystem | Remote server |
| Execution | Read only | Can execute code | Can execute code |
| Main goal | Read sensitive files | Read or execute local files | Execute remote malicious code |
| Common payload | ../ sequences | ../ + file paths | http://attacker.com/shell.php |
Path traversal only reads files. LFI goes further and can execute them. RFI pulls code from an external server entirely. This demonstration focuses on path traversal.

Path traversal happens when developers trust user input without validating the resulting file path. Here are common patterns that cause this vulnerability.

PHP - Direct file path concatenation without validation:
This pattern takes the file parameter directly from the URL and appends it to the base path. If an attacker sends ?file=../.env, the server reads /var/www/files/../.env, which resolves to /var/www/.env.

Python - Missing path boundary check:

    from flask import Flask, request, send_file
    import os

    app = Flask(__name__)
    BASE_DIR = "/var/www/files"

    @app.route("/files")
    def serve_file():
        # Vulnerable: the user-supplied name is joined and served without a boundary check.
        filename = request.args.get("file")
        filepath = os.path.join(BASE_DIR, filename)
        return send_file(filepath)
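To make the flaw in this handler concrete, the sketch below reproduces only its path construction, outside Flask. It is an illustration rather than part of the lab: build_path is a hypothetical helper name, and the printed results assume a POSIX system where os.path uses forward slashes.

```python
import os.path

BASE_DIR = "/var/www/files"  # same base directory as the handler above


def build_path(filename):
    """Mirror the vulnerable handler: join without any boundary check."""
    return os.path.join(BASE_DIR, filename)


# A normal request stays inside the base directory.
print(build_path("user-1.svg"))      # /var/www/files/user-1.svg

# A traversal payload is kept verbatim by join()...
joined = build_path("../.env")
print(joined)                        # /var/www/files/../.env

# ...and collapses to a path outside BASE_DIR once it is normalized.
print(os.path.normpath(joined))      # /var/www/.env

# An absolute filename makes join() discard BASE_DIR entirely.
print(build_path("/etc/passwd"))     # /etc/passwd
```

The last case is easy to overlook: no ../ is needed at all when the joined component is already absolute.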
The os.path.join() function does not prevent directory traversal. When filename is ../.env, the result is /var/www/files/../.env, which the operating system resolves to /var/www/.env.

Node.js - Unsafe path construction:

    const express = require('express');
    const fs = require('fs');
    const path = require('path');
    const app = express();

    app.get('/files', (req, res) => {
      // Vulnerable: the query value is joined into the path without a boundary check.
      const filename = req.query.file;
      const filepath = path.join(__dirname, 'files', filename);
      res.sendFile(filepath);
    });
The path.join() method does not reject ../ sequences. It normalizes them, so the joined path simply lands outside the files directory. An attacker sending ?file=../config/production.json gets the full production configuration file.

Java - File object without canonical path validation:

    @WebServlet("/files")
    public class FileServlet extends HttpServlet {
        @Override
        protected void doGet(HttpServletRequest req, HttpServletResponse resp) throws IOException {
            String filename = req.getParameter("file");
            File file = new File("/var/www/files", filename);
            // Missing: check if file.getCanonicalPath() starts with base directory
            Files.copy(file.toPath(), resp.getOutputStream());
        }
    }
The File constructor accepts any path string. Without calling getCanonicalPath() and verifying that the result stays within the allowed directory, traversal sequences pass through unchecked.

The common mistake across all these examples is the same: user input flows directly into file path construction without checking where the final path actually points.

All credentials, tokens, and data shown here are simulated values created for this lab. The environment runs on a local machine with no real user data. Never test path traversal on systems you do not own or have written permission to test. Unauthorized file access is illegal in most jurisdictions.

Before attacking, we need to discover which endpoints exist on the target application. In real-world scenarios, attackers use fuzzing, directory brute-forcing, or source code review to find them. This lab provides an /api endpoint that lists available routes for training purposes.

    curl http://localhost:8080/api | jq
The response reveals two endpoints: /api itself (which lists endpoints) and /files, which serves files using a file query parameter. The /files endpoint is our target for path traversal. In a real engagement, you would not have this convenient listing and would need to find the endpoint through reconnaissance.

The application runs on localhost:8080, and its /files endpoint serves user profile icons. We start by requesting a legitimate file to confirm normal behavior. The URL is quoted so that the shell does not try to interpret the ? character.

    curl "http://localhost:8080/files?file=user-1.svg"
The server returns the full SVG markup for the profile icon. The endpoint works as expected when given a valid filename within the intended directory: no errors, and no restrictions visible from the outside.
Now we test whether the server validates the file parameter. We use ../ to move one level up from the files directory and target an admin credentials file.

    curl "http://localhost:8080/files?file=../admin/credentials.txt"
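The same request can also be scripted. The following is a minimal Python sketch, not part of the lab repository: traversal_url and fetch are hypothetical helper names, and the base URL assumes the lab server from this article is running on localhost:8080. Only test it against that local lab.

```python
from urllib.parse import urlencode
from urllib.request import urlopen

BASE_URL = "http://localhost:8080/files"  # lab endpoint used in this article


def traversal_url(target, depth=1):
    """Build a /files URL whose file parameter climbs `depth` directories."""
    payload = "../" * depth + target
    # safe="/" keeps the slashes unescaped so the URL matches the curl command.
    return f"{BASE_URL}?{urlencode({'file': payload}, safe='/')}"


def fetch(target, depth=1):
    """Request the URL and return the response body as text."""
    with urlopen(traversal_url(target, depth)) as resp:
        return resp.read().decode("utf-8", errors="replace")


print(traversal_url("admin/credentials.txt"))
# http://localhost:8080/files?file=../admin/credentials.txt
```

With the lab running, calling fetch("admin/credentials.txt") should issue the same request as the curl command above.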
The server returns the credentials file without any error. You can see the fake admin email and password, three API tokens with different permission levels (read-only, read-write, admin), and internal service URLs for the payment and user services. The application accepted the ../admin/ path without any validation, which confirms that path traversal is possible.

With directory traversal confirmed, we go after the .env file. This file typically holds database connection details that the application loads at startup.

    curl "http://localhost:8080/files?file=../.env"
The response exposes five environment variables: DATABASE_HOST, DATABASE_PORT (5432, the default PostgreSQL port), DATABASE_USER, DATABASE_PASSWORD, and DATABASE_NAME. In a real application, these details let an attacker log in to your database server directly if it is reachable from their network. One request, and the database credentials are gone.

Backup files are another high-value target. They are often stored on the same server and contain a full snapshot of your data.

    curl "http://localhost:8080/files?file=../backups/db_dump.sql"
The dump includes the full schema and data for two tables. The users table has usernames, bcrypt password hashes, emails, and roles. The payments table has card numbers, CVV codes, and expiry dates. An attacker reading this file can crack the password hashes offline and attempt credential stuffing with the results, and also gains payment data whose exposure could lead to PCI DSS violations.

The last target is the production configuration file. These files often bundle multiple secrets together in one place.

    curl "http://localhost:8080/files?file=../config/production.json"
The JSON file reveals three sections: database connection details with a separate internal hostname (fake-db-host-name.internal), AWS credentials with an access key ID and secret access key, and a JWT signing secret with a 7-day expiry. With the AWS keys, an attacker can enumerate and access whatever cloud resources those keys are permitted to reach. With the JWT secret, they can forge authentication tokens and impersonate any user.

Path traversal is preventable. The root cause is the application trusting user input to build file paths. Here are the key fixes:

- Resolve the canonical path: Use realpath() or an equivalent to resolve the full path before doing anything with it.
- Validate against an allowlist: Check that the resolved path starts with the intended base directory. Reject anything outside it.
- Avoid raw path construction: Use file IDs or database references instead of accepting filenames directly from users.
- Apply least privilege: Run the application as a user that only has read access to the files directory. Even if traversal succeeds, the process cannot read files it has no permission to access.

Secure path validation example in Python:

    import os

    # Canonicalize the base once so a symlink in it cannot break the prefix check.
    ALLOWED_DIR = os.path.realpath("/var/www/files")

    def serve_file(filename):
        # Resolve ../ sequences and symlinks before validating the result.
        full_path = os.path.realpath(os.path.join(ALLOWED_DIR, filename))
        # The trailing separator keeps siblings such as /var/www/files_backup from matching.
        if not full_path.startswith(ALLOWED_DIR + os.sep):
            raise ValueError("Access denied: path traversal detected")
        with open(full_path, "rb") as fh:
            return fh.read()
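To see this check reject real payloads, the sketch below runs the same realpath-then-startswith logic against a throwaway directory tree. It is an illustration under stated assumptions: resolve_inside is a hypothetical helper name, the directory layout is invented for the demo, and the helper canonicalizes the base directory as well as the requested path.

```python
import os
import tempfile


def resolve_inside(base_dir, filename):
    """Return the canonical path for filename, or raise if it escapes base_dir."""
    base = os.path.realpath(base_dir)
    full_path = os.path.realpath(os.path.join(base, filename))
    if not full_path.startswith(base + os.sep):
        raise ValueError("Access denied: path traversal detected")
    return full_path


with tempfile.TemporaryDirectory() as root:
    files_dir = os.path.join(root, "files")
    os.mkdir(files_dir)
    os.mkdir(os.path.join(root, "files_backup"))  # sibling sharing the "files" prefix
    with open(os.path.join(files_dir, "user-1.svg"), "w") as fh:
        fh.write("<svg/>")
    with open(os.path.join(root, ".env"), "w") as fh:
        fh.write("DATABASE_PASSWORD=fake\n")

    # A file inside the base directory resolves normally.
    print(resolve_inside(files_dir, "user-1.svg"))

    # Each of these resolves outside files_dir and is rejected.
    for payload in ("../.env", "a/../../.env", "../files_backup/x", "/etc/passwd"):
        try:
            resolve_inside(files_dir, payload)
        except ValueError as exc:
            print(f"{payload!r} rejected: {exc}")
```

The ../files_backup/x case shows why the comparison appends os.sep: without it, a sibling directory whose name merely begins with the base directory's name would pass the prefix test.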
The key line is the startswith check after realpath(). Because realpath() resolves every ../ sequence and any symlinks before the comparison, chaining or nesting traversal sequences cannot bypass it. Encoded variants such as %2e%2e%2f are covered too, as long as the check runs on the decoded value that is actually passed to the filesystem.

Path traversal is a low-effort, high-impact vulnerability. Each of the five steps above was a single curl command, and every attack step needed only ../ in the file parameter. No special tools, no authentication bypass, just a missing validation check on the server side.

The attack chain here went from reading a profile icon to exposing admin credentials, database environment variables, a full database backup with payment data, and AWS cloud credentials. Each of those steps is one request. Protect your applications by resolving canonical paths, validating against a strict base directory, and running with least privilege.
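The "file IDs instead of filenames" fix listed among the key fixes can be sketched in a few lines. This is a minimal illustration, not the lab's implementation: FILES_BY_ID and path_for_id are hypothetical names, the second entry is invented for the example, and a real application would more likely keep the mapping in a database.

```python
from pathlib import PurePosixPath

BASE_DIR = PurePosixPath("/var/www/files")  # base directory from the examples above

# Hypothetical allowlist: public ID -> filename chosen by the server, never by the user.
FILES_BY_ID = {
    "1": "user-1.svg",
    "2": "user-2.svg",  # invented entry for illustration
}


def path_for_id(file_id):
    """Look up a server-side filename by ID; unknown IDs are rejected."""
    try:
        return BASE_DIR / FILES_BY_ID[file_id]
    except KeyError:
        raise LookupError(f"unknown file id: {file_id!r}") from None


print(path_for_id("1"))  # /var/www/files/user-1.svg

try:
    path_for_id("../.env")
except LookupError as exc:
    print(exc)  # unknown file id: '../.env'
```

Because the user-supplied value is only ever used as a dictionary key, a traversal payload has no path to influence; it just fails the lookup.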