RAGnet is an AI-powered document question-answering system that uses Retrieval-Augmented Generation (RAG) to provide accurate responses based on the content of uploaded PDF documents.
- PDF document ingestion and processing
- Semantic chunking for optimal text splitting
- Vector storage using Qdrant for efficient retrieval
- Hugging Face language model integration (google/gemma-2-27b quantized to fp4)
- Reranking and chain filtering options for improved accuracy
- Session-based chat history management
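To illustrate how the retrieval features above fit together, here is a toy sketch of a retrieve-then-rerank flow. It is not the RAGnet implementation: bag-of-words counts and cosine similarity stand in for real embeddings and Qdrant, a query-term coverage score stands in for a learned reranker, and every function name is hypothetical.

```python
import math
import re
from collections import Counter


def embed(text: str) -> Counter:
    """Toy stand-in for an embedding model: a bag-of-words term count."""
    return Counter(re.findall(r"[a-z0-9]+", text.lower()))


def cosine(a: Counter, b: Counter) -> float:
    """Cosine similarity between two sparse term-count vectors."""
    dot = sum(a[t] * b[t] for t in a.keys() & b.keys())
    norm_a = math.sqrt(sum(v * v for v in a.values()))
    norm_b = math.sqrt(sum(v * v for v in b.values()))
    return dot / (norm_a * norm_b) if norm_a and norm_b else 0.0


def retrieve(query: str, chunks: list[str], k: int = 3) -> list[str]:
    """First stage: return the k chunks most similar to the query."""
    q = embed(query)
    ranked = sorted(chunks, key=lambda c: cosine(q, embed(c)), reverse=True)
    return ranked[:k]


def rerank(query: str, candidates: list[str], top_n: int = 1) -> list[str]:
    """Second stage: reorder candidates by the share of query terms they contain."""
    q_terms = set(embed(query))

    def coverage(chunk: str) -> float:
        return len(q_terms & set(embed(chunk))) / len(q_terms) if q_terms else 0.0

    return sorted(candidates, key=coverage, reverse=True)[:top_n]


chunks = [
    "Qdrant stores vectors for similarity search.",
    "The invoice total is due within thirty days.",
    "Semantic chunking splits text at topic boundaries.",
]
query = "How does Qdrant store vectors?"
best = rerank(query, retrieve(query, chunks, k=2))
```

In a real pipeline the chunks selected this way would be passed to the language model as context for the answer.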
- Clone the repository:
    git clone https://github.com/Vinnu124/RAGnet
    cd RAGnet
- Install dependencies:
    pip install -r requirements.txt
- Set up environment variables:
    - Get the URL of your Qdrant cluster from https://qdrant.to/cloud
    - Create a .env file in the project root to hold the Qdrant URL.
    - Add a QDRANT_URL variable to the .env file, set to your cluster URL.
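As an illustration of the environment setup, the sketch below reads QDRANT_URL from a .env file using only the standard library. The helper names `load_env_file` and `get_qdrant_url` are hypothetical; the project itself may load the file differently (for example, with the python-dotenv package).

```python
import os
from pathlib import Path


def load_env_file(path: str = ".env") -> dict[str, str]:
    """Parse simple KEY=VALUE lines, skipping blank lines and # comments."""
    values: dict[str, str] = {}
    env_path = Path(path)
    if not env_path.exists():
        return values
    for line in env_path.read_text(encoding="utf-8").splitlines():
        line = line.strip()
        if not line or line.startswith("#") or "=" not in line:
            continue
        key, _, value = line.partition("=")
        # Drop optional surrounding quotes around the value.
        values[key.strip()] = value.strip().strip('"').strip("'")
    return values


def get_qdrant_url(path: str = ".env") -> str:
    """Prefer an already-exported variable, then fall back to the .env file."""
    url = os.environ.get("QDRANT_URL") or load_env_file(path).get("QDRANT_URL")
    if not url:
        raise RuntimeError("QDRANT_URL is not set; add it to .env or export it")
    return url
```

With this sketch, the .env file would hold a single line of the form `QDRANT_URL=<your-cluster-url>`, where the value is the URL copied from your Qdrant cluster.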
Run the main application:
    python main.py path/to/pdf1.pdf path/to/pdf2.pdf
To exit the application, type exit or quit.
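The command-line behavior described here (PDF paths as arguments, then a question loop that ends on exit or quit) could be sketched roughly as follows. This is an assumption about the control flow, not the contents of main.py; `chat_loop`, `main`, and the placeholder answer function are hypothetical names.

```python
import sys
from typing import Callable, Iterable, Iterator

EXIT_COMMANDS = {"exit", "quit"}


def chat_loop(lines: Iterable[str], answer: Callable[[str], str]) -> Iterator[str]:
    """Yield one reply per question until an exit command is read."""
    for line in lines:
        question = line.strip()
        if question.lower() in EXIT_COMMANDS:
            return
        if question:  # ignore empty input lines
            yield answer(question)


def main(argv: list[str]) -> int:
    """Validate the PDF arguments, then answer questions read from stdin."""
    pdf_paths = argv[1:]
    if not pdf_paths:
        print("usage: python main.py path/to/file.pdf [more.pdf ...]")
        return 1

    def answer(question: str) -> str:
        # Placeholder: a real implementation would query the RAG chain here.
        return f"(placeholder) would answer {question!r} from {len(pdf_paths)} PDF(s)"

    for reply in chat_loop(sys.stdin, answer):
        print(reply)
    return 0
```

Because `chat_loop` takes any iterable of lines, the same function can be driven by `sys.stdin` interactively or by a plain list in tests; a script would end with `raise SystemExit(main(sys.argv))`.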
Adjust settings in config.py:
- Model parameters (embeddings, reranker, language model)
- Retriever settings (use of reranker, chain filter)
- Database and file paths
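One possible shape for such a settings module is sketched below using frozen dataclasses. The class names, field names, placeholder model strings, and paths are all hypothetical and only mirror the three groups of settings listed above; the language model name is the one given in the feature list. Check config.py for the actual names.

```python
from dataclasses import dataclass, field
from pathlib import Path


@dataclass(frozen=True)
class ModelConfig:
    embeddings: str = "<embedding-model-name>"  # placeholder, not a real model id
    reranker: str = "<reranker-model-name>"     # placeholder, not a real model id
    llm: str = "google/gemma-2-27b"             # model named in the feature list


@dataclass(frozen=True)
class RetrieverConfig:
    use_reranker: bool = True       # hypothetical default
    use_chain_filter: bool = False  # hypothetical default


@dataclass(frozen=True)
class PathConfig:
    database_dir: Path = Path("data/db")  # hypothetical location
    upload_dir: Path = Path("data/pdfs")  # hypothetical location


@dataclass(frozen=True)
class Config:
    model: ModelConfig = field(default_factory=ModelConfig)
    retriever: RetrieverConfig = field(default_factory=RetrieverConfig)
    paths: PathConfig = field(default_factory=PathConfig)


# Override a single group while keeping the other defaults.
default_config = Config()
no_rerank_config = Config(retriever=RetrieverConfig(use_reranker=False))
```

Frozen dataclasses keep the settings read-only at runtime, so a change has to be made in one visible place.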
Project files:
- main.py: Entry point of the application
- config.py: Configuration settings
- ingestor.py: Document processing and vector store creation
- retriever.py: Document retrieval logic
- model.py: Language model and embedding configurations
- chain.py: Question-answering chain setup
- session_history.py: Chat history management
Contributions are welcome! Please feel free to submit a Pull Request.
This project is licensed under the MIT License - see the LICENSE file for details.