RAG Document Chatbot

A simple Retrieval Augmented Generation (RAG) chatbot that allows users to upload multiple documents and chat with them. The application uses Gradio for the frontend interface and Pinecone as the vector database.

Features

  • Upload multiple documents (PDF, TXT, DOCX)
  • Extract text from documents and store it as vector embeddings (a loader sketch follows this list)
  • Chat with your documents using natural language
  • Responses include source information
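
Document loading for the three supported formats could be handled roughly as follows. This is an illustrative sketch, not the repository's exact code: the load_document helper and the specific LangChain loaders (PyPDFLoader, TextLoader, Docx2txtLoader) are assumptions.

    from pathlib import Path

    from langchain.document_loaders import Docx2txtLoader, PyPDFLoader, TextLoader

    # Map each supported extension to a LangChain loader class (assumed choices).
    LOADERS = {".pdf": PyPDFLoader, ".txt": TextLoader, ".docx": Docx2txtLoader}

    def load_document(path: str):
        """Pick a loader by file extension and return a list of LangChain documents."""
        suffix = Path(path).suffix.lower()
        if suffix not in LOADERS:
            raise ValueError(f"Unsupported file type: {suffix}")
        return LOADERS[suffix](path).load()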

Prerequisites

  • Docker installed on your machine
  • OpenAI API key
  • Pinecone API key

Setup

  1. Clone this repository:

    git clone <repository-url>
    cd <repository-directory>
    
  2. Create a .env file by copying the example:

    cp .env.example .env
    
  3. Edit the .env file and fill in your API keys (a sketch of how these values might be loaded follows this list):

    OPENAI_API_KEY=your_openai_api_key_here
    PINECONE_API_KEY=your_pinecone_api_key_here
    PINECONE_ENVIRONMENT=your_pinecone_environment_here
    PINECONE_INDEX_NAME=your_pinecone_index_name_here
    
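For reference, the application is expected to read these values at startup along these lines. This is a minimal sketch assuming python-dotenv; the exact variable handling in the repository may differ.

    import os

    from dotenv import load_dotenv

    load_dotenv()  # read key/value pairs from .env into the process environment

    # Fail fast if any of the keys listed above are missing.
    REQUIRED = ["OPENAI_API_KEY", "PINECONE_API_KEY", "PINECONE_ENVIRONMENT", "PINECONE_INDEX_NAME"]
    missing = [name for name in REQUIRED if not os.getenv(name)]
    if missing:
        raise RuntimeError(f"Missing required environment variables: {', '.join(missing)}")

    PINECONE_INDEX_NAME = os.getenv("PINECONE_INDEX_NAME")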

Running with Docker

Option 1: Using Docker Compose (Recommended)

docker-compose up -d

This will build the image and start the container in the background. You can view the logs with:

docker-compose logs -f

Option 2: Using Docker directly

  1. Build the Docker image:

    docker build -t rag-document-chatbot .
    
  2. Run the container:

    docker run -p 7860:7860 --env-file .env rag-document-chatbot
    
  3. Access the application in your browser at:

    http://localhost:7860
    

Usage

  1. Go to the "Upload Documents" tab.
  2. Upload one or more documents (PDF, TXT, DOCX).
  3. Click "Process Documents" and wait for confirmation.
  4. Switch to the "Chat" tab.
  5. Ask questions about your documents.

How It Works

  1. Document Processing: Uploaded documents are processed, chunked, and embedded.
  2. Vector Storage: Document chunks are stored in Pinecone as vector embeddings.
  3. Retrieval: When you ask a question, the system finds the most relevant document chunks.
  4. Generation: The language model generates a response based on the retrieved context (see the pipeline sketch below).
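
The same four steps, sketched with classic LangChain-style APIs. This is illustrative only: the module names, chunk size, retriever settings, and model choices are assumptions, and it presumes the Pinecone client has already been initialized from the .env values.

    from langchain.chains import RetrievalQA
    from langchain.chat_models import ChatOpenAI
    from langchain.docstore.document import Document
    from langchain.embeddings import OpenAIEmbeddings
    from langchain.text_splitter import RecursiveCharacterTextSplitter
    from langchain.vectorstores import Pinecone

    # 1. Document Processing: wrap extracted text and split it into overlapping chunks.
    extracted_text = "Text pulled from an uploaded document goes here."
    docs = [Document(page_content=extracted_text, metadata={"source": "report.pdf"})]
    chunks = RecursiveCharacterTextSplitter(chunk_size=1000, chunk_overlap=100).split_documents(docs)

    # 2. Vector Storage: embed the chunks and upsert them into the Pinecone index.
    vectorstore = Pinecone.from_documents(chunks, OpenAIEmbeddings(), index_name="your_pinecone_index_name_here")

    # 3 + 4. Retrieval and Generation: fetch the most relevant chunks for a question
    #        and let the chat model answer from that context, keeping the sources.
    qa = RetrievalQA.from_chain_type(
        llm=ChatOpenAI(temperature=0),
        retriever=vectorstore.as_retriever(search_kwargs={"k": 4}),
        return_source_documents=True,
    )
    result = qa({"query": "What does the report conclude?"})
    print(result["result"])
    print([doc.metadata["source"] for doc in result["source_documents"]])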

Technical Stack

  • Python
  • Gradio (Frontend; a rough interface sketch follows this list)
  • LangChain (RAG framework)
  • OpenAI (Embeddings and LLM)
  • Pinecone (Vector Database)
  • Docker (Containerization)
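
For orientation, the two-tab Gradio layout described under Usage might be wired up roughly like this. process_documents and answer_question are hypothetical placeholders for the repository's actual handlers, and the component choices are assumptions.

    import gradio as gr

    def process_documents(files):
        # Placeholder: extract text, chunk, embed, and upsert into Pinecone.
        return f"Processed {len(files or [])} document(s)."

    def answer_question(message, history):
        # Placeholder: retrieve relevant chunks and generate a grounded answer.
        return "Answer based on your documents, with source information."

    with gr.Blocks(title="RAG Document Chatbot") as demo:
        with gr.Tab("Upload Documents"):
            uploads = gr.File(file_count="multiple", file_types=[".pdf", ".txt", ".docx"])
            status = gr.Textbox(label="Status")
            gr.Button("Process Documents").click(process_documents, inputs=uploads, outputs=status)
        with gr.Tab("Chat"):
            chatbot = gr.Chatbot()
            question = gr.Textbox(label="Ask a question about your documents")

            def respond(message, history):
                history = (history or []) + [(message, answer_question(message, history))]
                return "", history

            question.submit(respond, inputs=[question, chatbot], outputs=[question, chatbot])

    demo.launch(server_name="0.0.0.0", server_port=7860)

Binding to 0.0.0.0 on port 7860 matches the port exposed by the Docker commands above.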

License

This project is licensed under the MIT License - see the LICENSE file for details.
