Retriever-Me: Vector Search API

A high-performance retrieval API built with FastAPI and LangChain that provides semantic document search backed by the Pinecone vector database.

Overview

Retriever-Me is a REST API server that enables semantic search over your document collection. It converts each query into a vector with an OpenAI embedding model, then runs a similarity search against the Pinecone vector database.

Key features:

Fast and scalable semantic search
Similarity threshold filtering
Metadata-based filtering
Asynchronous request handling
Robust error handling and logging

Architecture

┌─────────────┐    ┌─────────────┐    ┌─────────────┐    ┌─────────────┐
│   FastAPI   │ -> │  Retriever  │ -> │  Embedding  │ -> │  Pinecone   │
│   Server    │    │    Layer    │    │    Model    │    │     DB      │
└─────────────┘    └─────────────┘    └─────────────┘    └─────────────┘

Server Layer: FastAPI REST API for handling requests
Retriever Layer: LangChain retriever for document fetching
Embedding Layer: OpenAI embeddings for vector conversion
Vector Store: Pinecone vector database for similarity search
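The flow through these four layers can be sketched with toy stand-ins. This is a minimal illustration, not the project's code: the bag-of-words `embed` function replaces the OpenAI embedding model, `InMemoryStore` replaces Pinecone, and all function and class names are hypothetical.

```python
import math
from collections import Counter


def embed(text: str) -> Counter:
    """Toy stand-in for the embedding model: a bag-of-words count vector."""
    return Counter(text.lower().split())


def cosine(a: Counter, b: Counter) -> float:
    """Cosine similarity between two sparse count vectors."""
    dot = sum(a[t] * b[t] for t in a.keys() & b.keys())
    norm_a = math.sqrt(sum(v * v for v in a.values()))
    norm_b = math.sqrt(sum(v * v for v in b.values()))
    return dot / (norm_a * norm_b) if norm_a and norm_b else 0.0


class InMemoryStore:
    """Toy stand-in for the vector store: a linear scan over stored vectors."""

    def __init__(self, docs):
        self.docs = [(embed(d["content"]), d) for d in docs]

    def search(self, vector, top_k, metadata_filter=None):
        wanted = metadata_filter or {}
        # Keep only documents whose metadata matches every filter key.
        scored = [
            (cosine(vector, vec), doc)
            for vec, doc in self.docs
            if all(doc["metadata"].get(k) == v for k, v in wanted.items())
        ]
        scored.sort(key=lambda pair: pair[0], reverse=True)
        return scored[:top_k]


def retrieve(store, query, top_k=3, threshold=0.4, metadata_filter=None):
    """Retriever layer: embed the query, search, then drop low-scoring hits."""
    hits = store.search(embed(query), top_k, metadata_filter)
    return [doc for score, doc in hits if score >= threshold]
```

In the real service the server layer would call something like `retrieve` for each request and return the surviving documents.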

Installation

1. Clone the repository

git clone https://github.com/your-username/retriever-me.git
cd retriever-me

2. Set up Conda environment

# Create a new conda environment
conda create -n retrieval-pipeline python=3.12 -y

# Activate the environment
conda activate retrieval-pipeline

# Install pip inside the conda environment (if needed)
conda install pip -y

3. Install dependencies

pip install -r requirements.txt

4. Set up environment variables

Create a .env file in the project root directory by copying the example file:

cp .env.example .env

Then edit the .env file to add your API keys:

# API Keys (required)
OPENAI_API_KEY=your_openai_api_key_here
PINECONE_API_KEY=your_pinecone_api_key_here

# Pinecone Settings
PINECONE_INDEX=your_pinecone_index_name
PINECONE_NAMESPACE=your_pinecone_namespace
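
A startup check along the following lines can catch a missing key before the first request fails. It is a sketch under assumptions: the variables are already present in the process environment (how the project loads the .env file is not shown here), the two Pinecone settings are treated as optional, and `load_env` is a hypothetical helper name.

```python
import os

# Variable names come from the .env example above.
REQUIRED = ("OPENAI_API_KEY", "PINECONE_API_KEY")
OPTIONAL = ("PINECONE_INDEX", "PINECONE_NAMESPACE")  # assumed optional


def load_env(environ=None):
    """Return the retrieval settings found in the environment.

    Raises RuntimeError naming every required variable that is missing or empty.
    """
    env = os.environ if environ is None else environ
    missing = [name for name in REQUIRED if not env.get(name)]
    if missing:
        raise RuntimeError(
            "Missing required environment variables: " + ", ".join(missing)
        )
    return {name: env.get(name, "") for name in REQUIRED + OPTIONAL}
```

Passing a plain dict instead of relying on `os.environ` keeps the check easy to test.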

Configuration

Configuration settings are managed in config/settings.py. Key settings include:

EMBEDDING_MODEL: Model used for text embeddings (default: "text-embedding-3-small")
DEFAULT_TOP_K: Default number of results to return (default: 3)
DEFAULT_SCORE_THRESHOLD: Minimum similarity score threshold (default: 0.4)
NAMESPACE: Pinecone namespace for data partitioning
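
For orientation, a settings module with these four values might look like the sketch below. The actual layout of config/settings.py is not shown in this README, so the `Settings` class, the `top_k` validation, and reading `NAMESPACE` from the `PINECONE_NAMESPACE` environment variable are all assumptions.

```python
import os
from dataclasses import dataclass, field


@dataclass(frozen=True)
class Settings:
    """Hypothetical container for the settings listed above."""

    EMBEDDING_MODEL: str = "text-embedding-3-small"
    DEFAULT_TOP_K: int = 3
    DEFAULT_SCORE_THRESHOLD: float = 0.4
    # Assumption: the namespace is taken from the PINECONE_NAMESPACE variable.
    NAMESPACE: str = field(
        default_factory=lambda: os.getenv("PINECONE_NAMESPACE", "")
    )

    def __post_init__(self):
        # A search must return at least one result to be useful.
        if self.DEFAULT_TOP_K < 1:
            raise ValueError("DEFAULT_TOP_K must be at least 1")


settings = Settings()
```

Freezing the dataclass prevents request handlers from changing a default by accident at runtime.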

Usage

Starting the server

Run the server locally:

# Make sure your conda environment is activated
conda activate retrieval-pipeline

# Start the server
python server.py

The server will start on http://localhost:8000.

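To reach the server from another machine on the same network (assuming the server listens on all interfaces rather than only on localhost):

# Find your IP address (Linux)
ip addr show

# Then open this URL from the other machine, substituting your address
http://your-ip-address:8000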

API Endpoints

Health Check

GET /health

Query Documents

POST /query

Request body:

{
  "query": "What is machine learning?",
  "top_k": 3,
  "threshold": 0.4,
  "filter": {
    "category": "technology",
    "source": "articles"
  }
}
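
The same request can be built from Python with only the standard library. This is a sketch: `build_query_request` is a hypothetical helper, the default values mirror the configuration defaults, and omitting `filter` when no metadata filter is given assumes the field is optional.

```python
import json
import urllib.request


def build_query_request(query, top_k=3, threshold=0.4, metadata_filter=None,
                        base_url="http://localhost:8000"):
    """Build (but do not send) a POST /query request with a JSON body."""
    payload = {"query": query, "top_k": top_k, "threshold": threshold}
    if metadata_filter:
        # Assumption: the server accepts requests without a "filter" key.
        payload["filter"] = metadata_filter
    return urllib.request.Request(
        base_url + "/query",
        data=json.dumps(payload).encode("utf-8"),
        headers={"Content-Type": "application/json"},
        method="POST",
    )
```

With the server running, passing the returned object to `urllib.request.urlopen` sends the request, and `json.load` on the response object decodes the reply.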

Response:

{
  "request_id": "d13ef401-6a01-4fbc-a4d2-a84815c8e83b",
  "query": "What is machine learning?",
  "documents": [
    {
      "content": "Machine learning is a subset of artificial intelligence that provides systems the ability to automatically learn and improve from experience without being explicitly programmed.",
      "metadata": {
        "category": "technology",
        "source": "articles",
        "author": "tech_writer",
        "url": "https://example.com/tech/machine-learning"
      }
    }
  ],
  "took_ms": 345
}
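
On the client side, a response of this shape can be decoded into typed objects. The sketch below relies only on the fields shown in the example above; the `Document` and `QueryResult` class names are hypothetical and are not part of the API.

```python
import json
from dataclasses import dataclass, field


@dataclass
class Document:
    content: str
    metadata: dict = field(default_factory=dict)


@dataclass
class QueryResult:
    request_id: str
    query: str
    documents: list
    took_ms: int


def parse_query_response(body: str) -> QueryResult:
    """Decode the JSON body of a /query response into dataclasses."""
    data = json.loads(body)
    documents = [
        Document(content=d["content"], metadata=d.get("metadata", {}))
        for d in data["documents"]
    ]
    return QueryResult(
        request_id=data["request_id"],
        query=data["query"],
        documents=documents,
        took_ms=data["took_ms"],
    )
```

An empty `documents` list is a valid result: it means no document passed the similarity threshold and filter.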

Project Structure

server.py: Main FastAPI application
main.py: CLI test script
config/: Configuration settings and logging
embeddings/: Embedding model implementations
retriever/: Retriever implementations
vectorstore/: Vector database connectors
utils/: Utility functions and metrics
logs/: Application logs directory

Logging

The application uses a structured logging system for tracking operations and debugging:

Log files are stored in the logs/ directory
Default log file: logs/pipeline.log
Logs include:
- Request details (query, parameters, request ID)
- Retrieval statistics (time taken, number of documents)
- API operations
- Errors and exceptions with tracebacks
- Server startup/shutdown events

Logging levels can be adjusted in config/logger.py based on your needs (DEBUG, INFO, WARNING, ERROR).

Example log entry:

[2025-05-22 14:47:40] [INFO] [api_server] Request 0cdda49f-1412-4cd7-812e-e5654d2b1a22: Query received: 'What is machine learning?' (top_k: 3, threshold: 0.4)
[2025-05-22 14:47:43] [INFO] [api_server] Request 0cdda49f-1412-4cd7-812e-e5654d2b1a22: Retrieved 3 documents in 3136ms
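
Entries in this format can be produced with the standard `logging` module. The sketch below is one way to do it and is not necessarily how config/logger.py is written; `build_logger` and its parameters are hypothetical.

```python
import logging

# Matches the "[timestamp] [LEVEL] [logger name] message" layout shown above.
LOG_FORMAT = "[%(asctime)s] [%(levelname)s] [%(name)s] %(message)s"
DATE_FORMAT = "%Y-%m-%d %H:%M:%S"


def build_logger(name="api_server", level=logging.INFO, log_file=None):
    """Return a logger that writes to log_file, or to stderr if none is given."""
    logger = logging.getLogger(name)
    logger.setLevel(level)
    if not logger.handlers:  # avoid duplicate handlers on repeated calls
        handler = (
            logging.FileHandler(log_file) if log_file else logging.StreamHandler()
        )
        handler.setFormatter(logging.Formatter(LOG_FORMAT, datefmt=DATE_FORMAT))
        logger.addHandler(handler)
    return logger
```

Calling `build_logger(log_file="logs/pipeline.log")` would write to the default log file named above, provided the logs/ directory already exists. Passing `level=logging.DEBUG` gives more verbose output.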

Development

To run the server in development mode with auto-reload:

# Activate conda environment
conda activate retrieval-pipeline

# Run with auto-reload
uvicorn server:app --reload

License

MIT
