Retriever-Me: Vector Search API

A high-performance retrieval API built with FastAPI and LangChain that provides semantic document search backed by the Pinecone vector database.

Overview

Retriever-Me is a REST API server that enables semantic search over your document collection. It uses OpenAI embeddings to convert each query into a vector and runs a similarity search against a Pinecone vector database.

Key features:

  • Fast and scalable semantic search
  • Similarity threshold filtering
  • Metadata-based filtering
  • Asynchronous request handling
  • Robust error handling and logging

Architecture

┌─────────────┐    ┌─────────────┐    ┌─────────────┐    ┌─────────────┐
│   FastAPI   │ -> │  Retriever  │ -> │  Embedding  │ -> │  Pinecone   │
│   Server    │    │    Layer    │    │    Model    │    │     DB      │
└─────────────┘    └─────────────┘    └─────────────┘    └─────────────┘
  • Server Layer: FastAPI REST API for handling requests
  • Retriever Layer: LangChain retriever for document fetching
  • Embedding Layer: OpenAI embeddings for vector conversion
  • Vector Store: Pinecone vector database for similarity search
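
The flow in the diagram can be sketched with stand-in components. This is only an illustration under stated assumptions: `toy_embed` replaces the OpenAI embedding model, `InMemoryStore` replaces Pinecone, and every name below is hypothetical rather than taken from the repository.

```python
import math
from dataclasses import dataclass, field


@dataclass
class Document:
    content: str
    metadata: dict = field(default_factory=dict)


def toy_embed(text: str) -> list[float]:
    """Stand-in for the embedding model: a 26-dim letter-frequency vector."""
    counts = [0.0] * 26
    for ch in text.lower():
        if "a" <= ch <= "z":
            counts[ord(ch) - ord("a")] += 1.0
    return counts


def cosine(a: list[float], b: list[float]) -> float:
    """Cosine similarity; returns 0.0 if either vector is all zeros."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return dot / (norm_a * norm_b) if norm_a and norm_b else 0.0


class InMemoryStore:
    """Stand-in for the vector store: brute-force search over a list."""

    def __init__(self, docs: list[Document]):
        self._items = [(doc, toy_embed(doc.content)) for doc in docs]

    def search(self, vector: list[float], top_k: int) -> list[tuple[Document, float]]:
        scored = [(doc, cosine(vector, vec)) for doc, vec in self._items]
        scored.sort(key=lambda pair: pair[1], reverse=True)
        return scored[:top_k]


def retrieve(store: InMemoryStore, query: str, top_k: int = 3,
             threshold: float = 0.4) -> list[tuple[Document, float]]:
    """Retriever layer: embed the query, search, then drop low-score hits."""
    vector = toy_embed(query)
    return [(doc, score) for doc, score in store.search(vector, top_k)
            if score >= threshold]


if __name__ == "__main__":
    store = InMemoryStore([Document("machine learning"), Document("zzz qqq")])
    for doc, score in retrieve(store, "machine learning"):
        print(f"{score:.2f}  {doc.content}")
```

In the real service, the server layer would call something like `retrieve` from the `POST /query` handler and return the surviving documents.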

Installation

1. Clone the repository

git clone https://github.com/your-username/retriever-me.git
cd retriever-me

2. Set up Conda environment

# Create a new conda environment
conda create -n retrieval-pipeline python=3.12 -y

# Activate the environment
conda activate retrieval-pipeline

# Install pip inside the conda environment (if needed)
conda install pip -y

3. Install dependencies

pip install -r requirements.txt

4. Set up environment variables

Create a .env file in the project root directory by copying the example file:

cp .env.example .env

Then edit the .env file to add your API keys:

# API Keys (required)
OPENAI_API_KEY=your_openai_api_key_here
PINECONE_API_KEY=your_pinecone_api_key_here

# Pinecone Settings
PINECONE_INDEX=your_pinecone_index_name
PINECONE_NAMESPACE=your_pinecone_namespace

Configuration

Configuration settings are managed in config/settings.py. Key settings include:

  • EMBEDDING_MODEL: Model used for text embeddings (default: "text-embedding-3-small")
  • DEFAULT_TOP_K: Default number of results to return (default: 3)
  • DEFAULT_SCORE_THRESHOLD: Minimum similarity score threshold (default: 0.4)
  • NAMESPACE: Pinecone namespace for data partitioning
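
As a rough sketch of how such settings could be modelled, the snippet below collects the documented defaults in one place. It is not the contents of config/settings.py: the `Settings` class, the `load_settings` helper, and the idea of overriding values through environment variables are assumptions made for illustration.

```python
import os
from dataclasses import dataclass


@dataclass(frozen=True)
class Settings:
    # Defaults mirror the values listed above.
    embedding_model: str = "text-embedding-3-small"
    default_top_k: int = 3
    default_score_threshold: float = 0.4
    namespace: str = ""


def load_settings(env=None) -> Settings:
    """Build Settings, letting environment variables override defaults.

    Assumption: the variable names used here are hypothetical, apart from
    PINECONE_NAMESPACE, which appears in the .env example.
    """
    env = os.environ if env is None else env
    return Settings(
        embedding_model=env.get("EMBEDDING_MODEL", "text-embedding-3-small"),
        default_top_k=int(env.get("DEFAULT_TOP_K", "3")),
        default_score_threshold=float(env.get("DEFAULT_SCORE_THRESHOLD", "0.4")),
        namespace=env.get("PINECONE_NAMESPACE", ""),
    )


if __name__ == "__main__":
    print(load_settings())
```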

Usage

Starting the server

Run the server locally:

# Make sure your conda environment is activated
conda activate retrieval-pipeline

# Start the server
python server.py

The server will start on http://localhost:8000.

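To reach the server from other machines on the same network (this only works if the server binds to 0.0.0.0 rather than 127.0.0.1 alone):

# Find your IP address (Linux)
ip addr show

# Then access the API from other computers at
# http://your-ip-address:8000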

API Endpoints

Health Check

GET /health

Query Documents

POST /query

Request body:

{
  "query": "What is machine learning?",
  "top_k": 3,
  "threshold": 0.4,
  "filter": {
    "category": "technology",
    "source": "articles"
  }
}

Response:

{
  "request_id": "d13ef401-6a01-4fbc-a4d2-a84815c8e83b",
  "query": "What is machine learning?",
  "documents": [
    {
      "content": "Machine learning is a subset of artificial intelligence that provides systems the ability to automatically learn and improve from experience without being explicitly programmed.",
      "metadata": {
        "category": "technology",
        "source": "articles",
        "author": "tech_writer",
        "url": "https://example.com/tech/machine-learning"
      }
    }
  ],
  "took_ms": 345
}
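
On the client side, a response of this shape could be turned into typed objects as sketched below. The class and function names are hypothetical; only the field names come from the example response above.

```python
import json
from dataclasses import dataclass


@dataclass
class RetrievedDocument:
    content: str
    metadata: dict


@dataclass
class QueryResponse:
    request_id: str
    query: str
    documents: list[RetrievedDocument]
    took_ms: int


def parse_query_response(raw: str) -> QueryResponse:
    """Parse the JSON body returned by POST /query."""
    data = json.loads(raw)
    documents = [
        RetrievedDocument(content=item["content"], metadata=item.get("metadata", {}))
        for item in data["documents"]
    ]
    return QueryResponse(
        request_id=data["request_id"],
        query=data["query"],
        documents=documents,
        took_ms=data["took_ms"],
    )


if __name__ == "__main__":
    sample = json.dumps({
        "request_id": "d13ef401-6a01-4fbc-a4d2-a84815c8e83b",
        "query": "What is machine learning?",
        "documents": [{"content": "Machine learning is ...",
                       "metadata": {"category": "technology"}}],
        "took_ms": 345,
    })
    parsed = parse_query_response(sample)
    print(parsed.request_id, len(parsed.documents), parsed.took_ms)
```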

Project Structure

  • server.py: Main FastAPI application
  • main.py: CLI test script
  • config/: Configuration settings and logging
  • embeddings/: Embedding model implementations
  • retriever/: Retriever implementations
  • vectorstore/: Vector database connectors
  • utils/: Utility functions and metrics
  • logs/: Application logs directory

Logging

The application uses a structured logging system for tracking operations and debugging:

  • Log files are stored in the logs/ directory
  • Default log file: logs/pipeline.log
  • Logs include:
    • Request details (query, parameters, request ID)
    • Retrieval statistics (time taken, number of documents)
    • API operations
    • Errors and exceptions with tracebacks
    • Server startup/shutdown events

Logging levels can be adjusted in config/logger.py based on your needs (DEBUG, INFO, WARNING, ERROR).

Example log entries:

[2025-05-22 14:47:40] [INFO] [api_server] Request 0cdda49f-1412-4cd7-812e-e5654d2b1a22: Query received: 'What is machine learning?' (top_k: 3, threshold: 0.4)
[2025-05-22 14:47:43] [INFO] [api_server] Request 0cdda49f-1412-4cd7-812e-e5654d2b1a22: Retrieved 3 documents in 3136ms

Development

To run the server in development mode with auto-reload:

# Activate conda environment
conda activate retrieval-pipeline

# Run with auto-reload
uvicorn server:app --reload

License

MIT
