🛡️ PII Guard is an LLM-powered tool that detects and manages Personally Identifiable Information (PII) in logs, designed to support data privacy and GDPR compliance

rpgeeganage/pII-guard

🛡️ PII Guard

PII Guard is an LLM-powered tool that detects and manages Personally Identifiable Information (PII) in logs, designed to support data privacy and GDPR compliance.

⚠️ This is a personal side project
Built to explore how Large Language Models can detect sensitive data in logs more intelligently than traditional regex-based approaches.

🧠 About

This project experiments with Large Language Models (LLMs), specifically the gemma:3b model running locally via Ollama, to evaluate how effectively they can identify PII in both structured and unstructured log data.

🧠 LLM-Based Detection with Ollama

  • Uses gemma:3b through the Ollama runtime
  • Analyzes logs using natural language understanding
  • Handles real-world, messy logs better than regex
  • Work in progress; contributions welcome!

💡 Why Use LLMs for PII Detection?

  • 🔍 Identifies PII even when it's obfuscated, incomplete, or embedded in text
  • 🌐 Handles multilingual input and inconsistent formats
  • 🧠 Leverages semantic context instead of relying on static patterns
  • 🧪 Ideal for experimenting with privacy tooling powered by AI

Traditional detection rules often break under complexity; LLMs provide contextual intelligence.
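
To make the limitation of static patterns concrete, here is a minimal sketch of a regex baseline. The pattern and both log lines are invented for illustration and are not taken from PII Guard; the point is only that a fixed email pattern matches the well-formed address and misses the spelled-out one.

```python
import re

# A deliberately simple email pattern, standing in for a typical static rule.
# This is an illustrative assumption, not a rule used by PII Guard.
EMAIL_RE = re.compile(r"[A-Za-z0-9._%+-]+@[A-Za-z0-9.-]+\.[A-Za-z]{2,}")


def regex_finds_email(line: str) -> bool:
    """Return True if the static pattern matches anywhere in the line."""
    return EMAIL_RE.search(line) is not None


# Hypothetical log lines, made up for this example.
plain = "user_login email=leila.park@example.com"
obfuscated = "contact: leila dot park at example dot com"

print(regex_finds_email(plain))       # True
print(regex_finds_email(obfuscated))  # False
```

A human reader sees the same address in both lines; the second is the kind of input where semantic context is expected to help.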


🧾 PII Types Detected

👤 Identity Information

full-name, first-name, last-name, username, email, phone-number, mobile, address, postal-code, location

🧠 Sensitive Categories (GDPR Art. 9)

racial-or-ethnic-origin, political-opinion, religious-belief, philosophical-belief, trade-union-membership, genetic-data, biometric-data, health-data, sex-life, sexual-orientation

🧾 Government & Financial Identifiers

national-id, passport-number, driving-license-number, ssn, vat-number, credit-card, iban, bank-account

🌐 Network & Device Information

ip-address, ip-addresses, mac-address, imei, device-id, device-metadata, browser-fingerprint, cookie-id, location-coordinates

🚘 Vehicle Information

license-plate


🏗️ Architecture

This is how PII Guard works:

[Figure: PII Guard architecture diagram]


🚀 Getting Started

  • Clone the repo and start everything with a single command:
    make all-in-up
  • Shut down everything with:
    make all-in-down
This will launch the full stack:

  • 🐘 PostgreSQL
  • 🔎 Elasticsearch
  • 🐇 RabbitMQ
  • 🤖 Ollama (with gemma:3b)
  • 🌐 PII Guard dashboard and backend API

🧪 Try It Out

🖥️ Web Interface

Visit: http://localhost:3000

🔌 API Endpoint

http://localhost:8888/api/jobs

🌀 Submit Sample Logs (cURL)

curl --location 'http://localhost:8888/api/jobs/flush' \
--header 'Content-Type: application/json' \
--data-raw '{
  "version": "1.0.0",
  "logs": [
    "{\"timestamp\":\"2025-04-21T15:02:10Z\",\"service\":\"auth-service\",\"level\":\"INFO\",\"event\":\"user_login\",\"requestId\":\"1a9c7e21\",\"user\":{\"id\":\"u9001001\",\"name\":\"Leila Park\",\"email\":\"[email protected]\"},\"srcIp\":\"198.51.100.15\"}",
    "{\"timestamp\":\"2025-04-21T15:02:12Z\",\"service\":\"cache-service\",\"level\":\"DEBUG\",\"event\":\"cache_miss\",\"requestId\":\"82c5cc9f\",\"cacheKey\":\"product_44291_variant_blue\",\"region\":\"us-east-1\"}"
  ]
}'
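
The same request can be sketched in Python using only the standard library. The URL, the `version` field, and the shape of `logs` (a list of JSON-encoded strings) mirror the cURL call above; the function names `build_payload` and `submit` are hypothetical, and because the response format is not documented here, the sketch returns the raw status and body rather than parsing them.

```python
import json
import urllib.request

# Endpoint taken from the cURL example above.
FLUSH_URL = "http://localhost:8888/api/jobs/flush"


def build_payload(records, version="1.0.0"):
    """Encode each record as a JSON string, matching the cURL request body."""
    return {"version": version, "logs": [json.dumps(r) for r in records]}


def submit(records, url=FLUSH_URL, timeout=10):
    """POST the records and return (HTTP status, raw response text)."""
    body = json.dumps(build_payload(records)).encode("utf-8")
    request = urllib.request.Request(
        url,
        data=body,
        headers={"Content-Type": "application/json"},
        method="POST",
    )
    with urllib.request.urlopen(request, timeout=timeout) as response:
        return response.status, response.read().decode("utf-8")


# The second sample record from the cURL example, as a Python dict.
sample = [{
    "timestamp": "2025-04-21T15:02:12Z",
    "service": "cache-service",
    "level": "DEBUG",
    "event": "cache_miss",
    "requestId": "82c5cc9f",
    "cacheKey": "product_44291_variant_blue",
    "region": "us-east-1",
}]

# Building the payload needs no running server; sending it does.
print(json.dumps(build_payload(sample), indent=2))
```

With the stack running, `submit(sample)` sends the request. Serializing each record with `json.dumps` avoids the hand-escaped quotes that the shell version needs.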

🧪 How to Test

See the Testing PII Guard guide for instructions on running the test setup, including simulated log generation and stress testing. It walks through setting up a test environment to evaluate PII Guard's performance and detection accuracy.


📂 Project Structure


🙌 Suggestions & Contributions

Got a bug to report? Feature request? Wild idea? Bring it on!

  • 🐛 Bug reports help improve stability
  • ✨ Feature requests help shape the product
  • 💬 Suggestions, feedback, and contributions are all welcome!
