Secure Coding: Path Traversal (os.path.join)


Modern microservices often expose APIs for downloading reports, logs, or documents.
If file paths are not validated properly, attackers can abuse path traversal (../) to read sensitive files from the server.
This is known as Local File Disclosure via Path Traversal.

✅ Good Flow — Intended Behavior

A support agent requests a report via the UI:

https://report-fetcher.mydomain.com/reports?file=sales-2025-09-05.csv

Step 1: User enters the request in the UI.
Step 2: UI calls the backend API with file=sales-2025-09-05.csv.
Step 3: The Reports microservice fetches /srv/reports/sales-2025-09-05.csv.
Step 4: The correct file is returned to the user.

❌ Malicious Flow — Exploit in Action

An attacker manipulates the query string:

https://report-fetcher.mydomain.com/reports?file=../../../../etc/passwd

Step 1: Attacker crafts payload with ../ sequences.
Step 2: API call passes file=../../../../etc/passwd.
Step 3: Backend joins this onto /srv/reports, and the operating system resolves the result to /etc/passwd.
Step 4: Sensitive OS file is leaked to the attacker.
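
To make Step 3 concrete, here is a minimal sketch of what a naive backend ends up opening for each request. The helper name requested_path is hypothetical, and posixpath is used only so the result matches a Linux server regardless of where you run it. The percent-encoded variant is included to show that the query string is decoded before the application ever sees it.

```python
import posixpath
from urllib.parse import parse_qs, urlsplit

BASE_DIR = "/srv/reports"  # same base directory as in the article


def requested_path(url: str) -> str:
    """Return the path a naive backend would open for this URL (hypothetical helper)."""
    query = parse_qs(urlsplit(url).query)  # percent-decoding happens here
    filename = query["file"][0]
    # Join onto the base directory, then collapse ".." the way the OS would
    return posixpath.normpath(posixpath.join(BASE_DIR, filename))


HOST = "https://report-fetcher.mydomain.com/reports"

benign = requested_path(HOST + "?file=sales-2025-09-05.csv")
attack = requested_path(HOST + "?file=../../../../etc/passwd")
encoded = requested_path(
    HOST + "?file=%2e%2e%2f%2e%2e%2f%2e%2e%2f%2e%2e%2fetc%2fpasswd"
)

print(benign)   # /srv/reports/sales-2025-09-05.csv
print(attack)   # /etc/passwd
print(encoded)  # /etc/passwd
```

Both the plain and the encoded payload land on the same file, which is why filtering the raw URL for "../" is not enough.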

⚠️ Vulnerable Code (Root Cause)

from flask import Flask, request, send_file, abort
import os

app = Flask(__name__)
BASE_DIR = "/srv/reports"


@app.get("/reports")
def get_report():
    filename = request.args.get("file")  # User-controlled input
    if not filename:
        abort(400, "Missing file parameter")

    # ❌ Vulnerable: os.path.join keeps "../" segments as-is, and an
    # absolute filename such as "/etc/passwd" replaces BASE_DIR entirely
    full_path = os.path.join(BASE_DIR, filename)
    if not os.path.exists(full_path):
        abort(404, "File not found")
    return send_file(full_path)

🔎 Why it’s vulnerable

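os.path.join only joins path components; it does not normalize or validate them, and an absolute component (e.g., /etc/passwd) discards BASE_DIR entirely.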
Attackers can insert ../ to escape /srv/reports.
Any file readable by the process (e.g., /etc/passwd, .env, configs) may be leaked.
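
The two behaviors of os.path.join that matter here are easy to check in a REPL. This is a small sketch that uses posixpath (the POSIX flavor of os.path) so the output is the same on any operating system; the variable names are only for illustration.

```python
import posixpath  # POSIX flavor of os.path, so results match a Linux server

BASE_DIR = "/srv/reports"

# 1. ".." segments are kept verbatim: join() does not normalize anything
relative = posixpath.join(BASE_DIR, "../../etc/passwd")

# 2. An absolute second argument discards BASE_DIR entirely
absolute = posixpath.join(BASE_DIR, "/etc/passwd")

print(relative)  # /srv/reports/../../etc/passwd
print(absolute)  # /etc/passwd
```

Neither result is rejected or cleaned up, so any safety has to come from checks the application adds itself.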

🛡️ Safe Code (Defensive Approach)

from flask import Flask, request, send_from_directory, abort
from pathlib import Path

app = Flask(__name__)
BASE_DIR = Path("/srv/reports").resolve()
ALLOWED_EXTENSIONS = {".csv", ".pdf"}


def is_safe(name: str) -> bool:
    """Cheap syntactic checks applied before touching the filesystem."""
    p = Path(name)
    return (
        not p.is_absolute() and
        p.suffix.lower() in ALLOWED_EXTENSIONS and
        ".." not in p.parts
    )


@app.get("/reports")
def get_report_safe():
    filename = request.args.get("file")
    if not filename or not is_safe(filename):
        abort(400, "Invalid file request")

    # Canonicalize (collapses ".." and follows symlinks), then confirm
    # the result is still inside BASE_DIR
    safe_path = (BASE_DIR / filename).resolve()
    try:
        safe_path.relative_to(BASE_DIR)
    except ValueError:
        abort(403, "Traversal attempt blocked")

    if not safe_path.exists() or not safe_path.is_file():
        abort(404, "File not found")
    return send_from_directory(BASE_DIR, filename, as_attachment=True)

Why it’s safe

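✅ Canonicalization (Path.resolve()) collapses .. segments and follows symlinks to produce the real path.
✅ Containment check (relative_to(BASE_DIR)) ensures the resolved file stays inside /srv/reports.
✅ Only .csv and .pdf extensions are allowed.
✅ Traversal tokens (..) and absolute paths are rejected before any filesystem access.
✅ Defense-in-depth (outside this code): run the app as non-root and mount /srv/reports read-only.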
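
The canonicalize-then-contain step can be tried without Flask. The sketch below is an illustration under assumptions, not the article's exact handler: resolve_inside is a hypothetical helper, and the files are created in a temporary directory purely for the demo. It exercises a normal name, a "../" payload, an absolute path, and a symlink that points outside the base directory.

```python
import tempfile
from pathlib import Path
from typing import Optional


def resolve_inside(base: Path, name: str) -> Optional[Path]:
    """Return the resolved path if it stays under base, else None (hypothetical helper)."""
    base = base.resolve()
    candidate = (base / name).resolve()  # collapses ".." and follows symlinks
    try:
        candidate.relative_to(base)      # raises ValueError if outside base
    except ValueError:
        return None
    return candidate


with tempfile.TemporaryDirectory() as tmp:
    base = Path(tmp) / "reports"
    base.mkdir()
    (base / "sales.csv").write_text("ok")
    secret = Path(tmp) / "secret.txt"    # lives outside the base directory
    secret.write_text("nope")

    inside = resolve_inside(base, "sales.csv")
    dotdot = resolve_inside(base, "../secret.txt")
    absolute = resolve_inside(base, str(secret))

    try:
        (base / "link.csv").symlink_to(secret)
        via_symlink = resolve_inside(base, "link.csv")
    except OSError:
        via_symlink = None               # symlinks unavailable on this platform

print(inside is not None)                # True: a normal file is allowed
print(dotdot, absolute, via_symlink)     # None None None: all three are rejected
```

Because the check runs on the resolved path, the symlink case is caught by the same comparison as the "../" case. It does not remove the window between the check and the later file open, which is one reason the operational controls still matter.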

🔧 Remediation Approaches (Weak → Strong)

1. Blacklisting traversal patterns (../)
✅ Easy to implement
❌ Trivially bypassed with encodings (%2e%2e%2f, mixed slashes)
❌ Not recommended
2. Basic allow-list of extensions
✅ Stops some attacks
❌ Does not stop traversal to files that have an allowed extension, and is still bypassable with symlinks or crafted names
3. Canonicalization with Path.resolve()
✅ Collapses .. segments and resolves symlinks into one real path (percent-decoding is already done by the web framework)
❌ Without containment check, files outside /srv/reports can still leak
4. Containment enforcement (.relative_to(BASE_DIR))
✅ Ensures resolved path stays inside the intended directory
❌ Must be applied to the resolved path, otherwise symlinks slip through; a symlink created after the check can still race it
5. Symlink detection & blocking
✅ Prevents attackers from planting symlinks to sensitive files
❌ Adds filesystem-check overhead
6. Opaque file IDs / signed tokens instead of raw names
✅ Users never directly control filenames
✅ Strongest mitigation for file disclosure
Example:

/reports?id=12345

Internally maps to /srv/reports/sales-2025-09-05.csv

7. Operational hardening (defense-in-depth)
✅ Run service as non-root
✅ Mount /srv/reports read-only
✅ Add WAF/API Gateway rules against traversal patterns
✅ Log suspicious requests and alert SOC
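
The opaque-ID approach with a signed token could look roughly like the sketch below. Everything here is an assumption for illustration: the REPORTS dictionary stands in for a database table, SECRET_KEY would come from a secrets manager, and sign and lookup are hypothetical helper names. The point is only that the client sends an ID and a signature, never a path.

```python
import hashlib
import hmac
from typing import Optional

SECRET_KEY = b"replace-with-a-real-secret"  # assumption: loaded from a secrets store

# Assumption: in a real service this mapping lives in a database
REPORTS = {"12345": "/srv/reports/sales-2025-09-05.csv"}


def sign(report_id: str) -> str:
    """Return an HMAC-SHA256 signature for a report ID (hypothetical helper)."""
    return hmac.new(SECRET_KEY, report_id.encode(), hashlib.sha256).hexdigest()


def lookup(report_id: str, signature: str) -> Optional[str]:
    """Return the server-side path only for a known ID with a valid signature."""
    expected = sign(report_id).encode()
    if not hmac.compare_digest(expected, signature.encode()):
        return None
    return REPORTS.get(report_id)  # unknown IDs map to nothing


token = sign("12345")
print(lookup("12345", token))     # /srv/reports/sales-2025-09-05.csv
print(lookup("12345", "0" * 64))  # None: bad signature
print(lookup("../../etc/passwd", sign("../../etc/passwd")))  # None: not a known ID
```

Even if an attacker could somehow obtain a valid signature for a traversal string, the lookup still returns nothing, because the path is chosen by the server-side mapping rather than by user input.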

📊 Compliance Mapping

OWASP Top 10 (2021):

A01: Broken Access Control — Path traversal is fundamentally an access control failure: users access files outside their intended scope.
A05: Security Misconfiguration — Lack of safe defaults, missing input validation, and unsafe file handling APIs all contribute to misconfiguration.
A04: Insecure Design — Directly trusting user-controlled filenames without architectural safeguards.

MITRE ATT&CK:

T1005: Data from Local System — Attacker collects sensitive files from local file systems.
T1083: File and Directory Discovery — Attackers often enumerate files to find valuable targets before exploitation.

🎯 Key Takeaways

Path traversal attacks remain common and dangerous in modern microservices.
Building file paths directly from user input, whether by string concatenation or os.path.join, is never safe.
Use canonicalization + containment checks + allow-lists as minimum defenses.
Best practice: don’t trust filenames at all — issue opaque IDs or signed URLs instead.
Always combine code-level controls with operational hardening (least privilege, logging, WAF).
Map controls to OWASP and MITRE for better security posture and compliance alignment.

👉 With these layered defenses, your microservices will be far more resilient against Local File Disclosure via Path Traversal, and your remediation story will be easier to defend to both security teams and auditors.