Completing RFIs is extremely boring. No one likes doing them. This system is designed to help you complete RFIs faster and more efficiently, so you can focus on half-finishing cool projects and playing with new technologies.
It is a local RAG (Retrieval-Augmented Generation) system for searching RFI documents, policies, SOPs, and other technical documentation, backed by the Qdrant vector database.
If it works, please reach out and thank me. If it doesn't, well, it worked on my machine.
- Local Embeddings: Uses SentenceTransformers (all-MiniLM-L6-v2) for semantic search
- GPU Acceleration: Optimized batch processing with CUDA support when available
- Multiple File Formats: Supports PDF, DOCX, XLSX, TXT, MD
- FastAPI Server: RESTful API with automatic documentation
- Qdrant Vector Database: High-performance vector storage with Docker deployment
- Semantic Search: Find documents by meaning, not just keywords
- Stable Dependencies: Tested with PyTorch 1.13.1, transformers 4.21.3
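To make "search by meaning" a little more concrete, here is a minimal sketch of how vector ranking generally works. It assumes cosine similarity (a common choice for all-MiniLM-L6-v2, though the metric this project configures in Qdrant isn't stated here) and uses made-up 3-dimensional vectors instead of real 384-dimensional embeddings. The `rank` helper and the document names are hypothetical; `top_k` and `min_score` mirror the parameters of the `/search` endpoint.

```python
import math


def cosine_similarity(a, b):
    """Cosine of the angle between two equal-length vectors (0.0 if either is all zeros)."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0 or norm_b == 0:
        return 0.0
    return dot / (norm_a * norm_b)


def rank(query_vec, doc_vecs, top_k=5, min_score=0.0):
    """Score every document, drop those below min_score, return the best top_k."""
    scored = [(name, cosine_similarity(query_vec, vec)) for name, vec in doc_vecs.items()]
    kept = [(name, score) for name, score in scored if score >= min_score]
    kept.sort(key=lambda pair: pair[1], reverse=True)
    return kept[:top_k]


# Toy vectors for illustration only; real embeddings come from the model.
docs = {
    "data_retention_policy": [0.9, 0.1, 0.0],
    "incident_response_sop": [0.1, 0.9, 0.1],
    "office_party_memo": [0.0, 0.1, 0.9],
}
query = [1.0, 0.2, 0.0]
results = rank(query, docs, top_k=2, min_score=0.5)
```

With these toy numbers only the retention policy clears the 0.5 threshold, which is the same effect a high `min_score` has on real queries: fewer, closer matches.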
The system requires a specific directory structure. Ensure you have:
# Create the main project directory structure
mkdir -p existing_files/{Policies,"RFIs and Other",SOPs,Statements,"Support Docs"}
# Create virtual environment
python3 -m venv venv
source venv/bin/activate # On Windows: venv\Scripts\activate
# Install stable dependencies
pip install -r requirements.txt
# Start Qdrant with Docker (includes health checks)
./start_qdrant.sh
This will:
- Start Qdrant vector database on localhost:6333
- Wait for the database to be ready
- Provide access to Qdrant UI at http://localhost:6333/dashboard
python setup_qdrant.py
This will:
- Process all files in existing_files/
- Create embeddings using CPU/GPU (auto-detected)
- Store vectors in Qdrant database
- Show statistics about processed documents
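The first of those steps, finding the files to process, can be sketched roughly as follows. This is not the code in setup_qdrant.py or qdrant_processor.py; `SUPPORTED_EXTENSIONS` and `find_documents` are hypothetical names, and the extension list is simply the supported formats listed above.

```python
from pathlib import Path

# Assumption: the supported formats map onto these file extensions.
SUPPORTED_EXTENSIONS = {".pdf", ".docx", ".xlsx", ".txt", ".md"}


def is_supported(path):
    """True if the file extension (case-insensitive) is one the system can read."""
    return Path(path).suffix.lower() in SUPPORTED_EXTENSIONS


def find_documents(root="existing_files"):
    """Recursively collect supported files under root, in sorted order."""
    root = Path(root)
    return sorted(p for p in root.rglob("*") if p.is_file() and is_supported(p))
```

Running `find_documents()` from the project directory before setup is a quick way to see what would be picked up and what would be silently ignored.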
python api_server_qdrant.py
Server runs at: http://localhost:8000
API docs at: http://localhost:8000/docs
curl "http://localhost:8000/search/simple?query=GDPR+compliance&limit=3"
curl -X POST "http://localhost:8000/search" \
  -H "Content-Type: application/json" \
  -d '{
    "query": "data security policies",
    "top_k": 5,
    "min_score": 0.7,
    "category_filter": "Policy"
  }'
curl "http://localhost:8000/stats"
curl "http://localhost:8000/health"
You MUST run Claude Code from this project directory for Claude to discover and use the API:
# Navigate to the project directory
cd /path/to/rfi_vector_search
# Start Claude Code from this directory
claude
# OR use the full path
claude /path/to/rfi_vector_search
Claude Code doesn't automatically know about your API. It discovers it through:
- CLAUDE.md: Contains system documentation and API instructions
- README.md: Documents available endpoints and usage
- Source Code: Can read api_server_qdrant.py to understand the API
- Running Processes: Can detect the server running on port 8000
Once the server is running and you're in the correct directory, Claude can search your documents:
# Search for compliance-related documents
curl "http://localhost:8000/search/simple?query=ISO+27001+compliance"
# Find specific policies
curl "http://localhost:8000/search/simple?query=data+retention+policy"
# Search SOPs
curl "http://localhost:8000/search/simple?query=incident+response+procedure"
- ✅ Run from project directory: Claude needs access to CLAUDE.md and documentation
- ✅ Server must be running: python api_server_qdrant.py in background
- ✅ Qdrant must be running: ./start_qdrant.sh executed first
- ✅ Documents processed: python setup_qdrant.py completed successfully
Note: Claude uses standard HTTP requests via curl - the same commands you would use manually. It's not a special integration, just well-documented local API usage.
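Because it is plain HTTP, the same requests can be made from a script. The sketch below uses only the Python standard library and mirrors the curl examples for `/search/simple` and `/search`. The helper names are hypothetical, and since the response format isn't documented here, `fetch_json` just returns whatever JSON the server sends back.

```python
import json
from urllib.parse import urlencode
from urllib.request import Request, urlopen

BASE_URL = "http://localhost:8000"


def simple_search_url(query, limit=3):
    """Build the GET URL for /search/simple with a properly encoded query string."""
    return f"{BASE_URL}/search/simple?{urlencode({'query': query, 'limit': limit})}"


def advanced_search_request(query, top_k=5, min_score=0.7, category_filter=None):
    """Build the POST request for /search with the JSON body used in the curl example."""
    payload = {"query": query, "top_k": top_k, "min_score": min_score}
    if category_filter is not None:
        payload["category_filter"] = category_filter
    return Request(
        f"{BASE_URL}/search",
        data=json.dumps(payload).encode("utf-8"),
        headers={"Content-Type": "application/json"},
        method="POST",
    )


def fetch_json(url_or_request, timeout=10):
    """Send the request and parse the body as JSON (requires the server to be running)."""
    with urlopen(url_or_request, timeout=timeout) as response:
        return json.loads(response.read().decode("utf-8"))
```

With the server up, `fetch_json(simple_search_url("ISO 27001 compliance"))` is the scripted equivalent of the first curl command above.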
The system automatically categorizes documents:
- RFI: Files in "RFIs and Other/"
- Policy: Files in "Policies/"
- SOP: Files in "SOPs/"
- Statement: Files in "Statements/"
- Support Doc: Files in "Support Docs/"
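A folder-to-category mapping like the one above can be expressed in a few lines. This is a sketch of the idea, not the logic in qdrant_processor.py: `CATEGORY_BY_FOLDER` and `categorize` are hypothetical names, paths are taken relative to existing_files/, and the "Unknown" fallback for files outside the five folders is an assumption.

```python
from pathlib import PurePosixPath

# Top-level folder under existing_files/ -> category label used in search filters.
CATEGORY_BY_FOLDER = {
    "RFIs and Other": "RFI",
    "Policies": "Policy",
    "SOPs": "SOP",
    "Statements": "Statement",
    "Support Docs": "Support Doc",
}


def categorize(relative_path, default="Unknown"):
    """Return the category for a path given relative to existing_files/."""
    parts = PurePosixPath(relative_path).parts
    if not parts:
        return default
    # Only the first path component decides the category; nested folders inherit it.
    return CATEGORY_BY_FOLDER.get(parts[0], default)
```

The practical upshot: a policy saved into the wrong folder gets the wrong category, and a `category_filter` search for "Policy" will not find it.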
The system expects the following directory structure to be in place:
rfi_vector_search/
├── existing_files/ # Source documents directory
│ ├── Policies/ # Policy documents (POL-*.docx)
│ ├── RFIs and Other/ # RFI documents and other files
│ ├── SOPs/ # Standard Operating Procedures (SOP-*.docx)
│ ├── Statements/ # Compliance statements (STAT-*.docx)
│ └── Support Docs/ # Additional support documentation
├── qdrant_data/ # Qdrant vector database storage (Docker volume)
├── venv/ # Python virtual environment
├── __pycache__/ # Python bytecode cache
├── qdrant_processor.py # Core document processing logic (Qdrant)
├── api_server_qdrant.py # FastAPI web server (Qdrant)
├── setup_qdrant.py # Initial setup and document processing
├── start_qdrant.sh # Qdrant startup script
├── docker-compose.yml # Qdrant Docker configuration
├── requirements.txt # Python package dependencies
├── CLAUDE.md # Claude Code guidance
└── README.md # This documentation
existing_files/: This is the main source directory containing all documents to be processed. The system automatically categorizes documents based on their subdirectory:
- Policies/: Contains policy documents (POL-*.docx files)
- RFIs and Other/: Contains RFI documents, templates, and other miscellaneous files
- SOPs/: Contains Standard Operating Procedures (SOP-*.docx files)
- Statements/: Contains compliance and regulatory statements (STAT-*.docx files)
- Support Docs/: Contains additional support documentation
qdrant_data/: Docker volume for Qdrant vector database storage:
- Vector embeddings of processed documents
- Qdrant database files and indexes
- Search metadata and configuration
venv/: Python virtual environment (created during setup)
__pycache__/: Python bytecode cache (auto-generated)
- Import Errors: Ensure virtual environment is activated and dependencies installed
- No Results: Check that documents were processed successfully with setup_qdrant.py
- Server Won't Start: Check port 8000 isn't already in use
- Qdrant Connection: Ensure Docker is running and Qdrant container is healthy
- GPU Issues: System falls back to CPU automatically if CUDA unavailable
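For the port-related items, a quick check from Python can tell you whether something is already listening on 8000 (the API server) or 6333 (Qdrant). This is a small diagnostic sketch using only the standard library; `port_in_use` and `check_services` are hypothetical helpers, not part of the project.

```python
import socket


def port_in_use(port, host="127.0.0.1", timeout=0.5):
    """True if a TCP connection to host:port succeeds, i.e. something is listening."""
    with socket.socket(socket.AF_INET, socket.SOCK_STREAM) as sock:
        sock.settimeout(timeout)
        # connect_ex returns 0 on success and an error code otherwise.
        return sock.connect_ex((host, port)) == 0


def check_services():
    """Report whether the default API and Qdrant ports have a listener."""
    return {"api_server_8000": port_in_use(8000), "qdrant_6333": port_in_use(6333)}
```

If port 8000 reports as in use before you start the server, something else has it. If 6333 reports as free, the Qdrant container is probably not running.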
If you encounter version conflicts, use these stable versions:
pip install torch==1.13.1 transformers==4.21.3 sentence-transformers==2.2.2 qdrant-client
- GPU Usage: System auto-detects CUDA and optimizes batch sizes for available VRAM
- Large Files: System automatically limits processing for very large PDFs/Excel files
- Memory: Clear GPU cache between large processing jobs if needed
- Add files to the existing_files/ directory
- Run: curl -X POST "http://localhost:8000/process"
- New documents will be processed and added to the database
Import the provided Postman collection (RFI_Vector_Search.postman_collection.json) to test all API endpoints. The collection includes 11 comprehensive tests:
- Health Check (/health): Verify system status and Qdrant connection
- Root Health Check (/): Basic API availability test
- Get Statistics (/stats): Database statistics and document counts
- Simple Search - Compliance: Test basic search functionality
- Simple Search - Data Security: Search for specific policy types
- Advanced Search - High Scored Results: Test minimum score filtering
- Advanced Search - Category Filter: Test category-specific searches
- Advanced Search - SOPs: Search Standard Operating Procedures
- Process Documents (/process): Test document ingestion endpoint
- Empty Query Handling: Test validation for empty search queries
- Large Limit Test: Test behavior with large result limits
- Automated Testing: Each request includes test scripts to verify responses
- Environment Variables: Uses {{base_url}} variable (default: http://localhost:8000)
- Response Validation: Tests check response structure, data types, and business logic
- Error Handling: Tests verify proper error responses for edge cases
- Import the collection into Postman
- Ensure the API server is running (python api_server_qdrant.py)
- Run individual tests or the entire collection
- View test results in the Postman Test Results tab