- Node.js — version 20 or higher
- Ollama — [download here](https://ollama.com)
- LLM — gemma3:4b, or any other model your hardware can run
- Embedding Model — dengcao/Qwen3-Embedding-0.6B:Q8_0 (see the check after this list)
- Minimum Free RAM — 8 GB
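With Ollama installed, pull both models with `ollama pull gemma3:4b` and `ollama pull dengcao/Qwen3-Embedding-0.6B:Q8_0`. The snippet below is a minimal sanity check (a sketch, not part of the project) that queries Ollama's `/api/tags` endpoint, which lists locally available models; it assumes the default port used in the env file further down.

```ts
// check-ollama.ts — verify Ollama is up and both required models are pulled.
const OLLAMA_BASE_URL = process.env.OLLAMA_BASE_URL ?? "http://localhost:11434";
const REQUIRED_MODELS = ["gemma3:4b", "dengcao/Qwen3-Embedding-0.6B:Q8_0"];

async function main() {
  // /api/tags lists every model available locally.
  const res = await fetch(`${OLLAMA_BASE_URL}/api/tags`);
  if (!res.ok) throw new Error(`Ollama not reachable at ${OLLAMA_BASE_URL}`);
  const { models } = (await res.json()) as { models: { name: string }[] };
  const installed = models.map((m) => m.name);
  for (const model of REQUIRED_MODELS) {
    console.log(
      installed.includes(model)
        ? `OK: ${model} is available`
        : `MISSING: ${model} — run: ollama pull ${model}`
    );
  }
}

main().catch(console.error);
```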
- Next.js
- LangChain.js
- Perplexity API
- Ollama
- Pinecone
- Upstash (Redis)
- Chatbot — Talk directly with your chosen LLM (see the sketch after this list).
- Chat with Document — Perform RAG-based queries using your uploaded files.
- Chat with URL (coming soon) — Interact with web content directly.
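As a taste of the plain-chatbot mode, here is a minimal sketch using LangChain's `@langchain/ollama` integration. It reuses the model and base URL from the env file below; it is not the project's actual route handler.

```ts
// chatbot.ts — minimal "Chat with LLM" sketch via LangChain's Ollama wrapper.
import { ChatOllama } from "@langchain/ollama";

const model = new ChatOllama({
  baseUrl: process.env.OLLAMA_BASE_URL ?? "http://localhost:11434",
  model: process.env.OLLAMA_MODEL ?? "gemma3:4b",
});

// invoke() accepts a plain string (or a message array) and returns an AIMessage.
const reply = await model.invoke("Explain RAG in one sentence.");
console.log(reply.content);
```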
- Chat with the Perplexity API for up-to-date responses (sketched after this list).
- Chat with local LLMs via Ollama (offline).
- Use RAG (Retrieval-Augmented Generation) to chat with your documents (see the RAG sketch below).
- Simple UI to switch between:
- Chat with LLM
- Chat with Document
- Chat with URL (coming soon)
- Namespace-based vector separation for files.
- Contextual memory for persistent conversations (see the memory sketch below).
- Middleware to rate-limit API requests and prevent overuse (see the middleware sketch below).
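For the Perplexity mode, here is a minimal sketch of a request to Perplexity's OpenAI-compatible chat completions endpoint. The `sonar` model id is an assumption; check Perplexity's docs for current model ids.

```ts
// perplexity.ts — sketch of the online "up-to-date" mode.
const res = await fetch("https://api.perplexity.ai/chat/completions", {
  method: "POST",
  headers: {
    Authorization: `Bearer ${process.env.PERPLEXITY_API_KEY}`,
    "Content-Type": "application/json",
  },
  body: JSON.stringify({
    model: "sonar", // assumption — see Perplexity's docs for available models
    messages: [{ role: "user", content: "What changed in Node.js 20?" }],
  }),
});
const data = await res.json();
console.log(data.choices[0].message.content);
```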
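For document chat, the sketch below shows the retrieval half of a RAG flow with LangChain and Pinecone, illustrating the per-file namespace separation. The index name and namespace are hypothetical, not the project's actual values.

```ts
// rag.ts — RAG lookup sketch: one Pinecone namespace per uploaded file,
// so each document's chunks stay isolated.
import { Pinecone } from "@pinecone-database/pinecone";
import { PineconeStore } from "@langchain/pinecone";
import { OllamaEmbeddings } from "@langchain/ollama";

const embeddings = new OllamaEmbeddings({
  model: process.env.EMBEDDING_MODEL ?? "dengcao/Qwen3-Embedding-0.6B:Q8_0",
  baseUrl: process.env.OLLAMA_BASE_URL ?? "http://localhost:11434",
});

const pinecone = new Pinecone({ apiKey: process.env.PINECONE_KEY! });
const index = pinecone.index("chat-docs"); // hypothetical index name

// The namespace scopes the search to a single uploaded file's vectors.
const store = await PineconeStore.fromExistingIndex(embeddings, {
  pineconeIndex: index,
  namespace: "my-file-id", // hypothetical: one namespace per uploaded file
});

// Retrieve the 4 chunks most similar to the question.
const chunks = await store.similaritySearch("What does chapter 2 cover?", 4);
console.log(chunks.map((c) => c.pageContent));
```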
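For contextual memory, one plausible sketch: keep each session's turns in an Upstash Redis list and replay them to the model on every turn. The `chat:<sessionId>` key scheme is an assumption, not the project's actual storage layout.

```ts
// memory.ts — contextual-memory sketch: chat history lives in an Upstash
// Redis list keyed by session id and is replayed to the model each turn.
import { Redis } from "@upstash/redis";
import { ChatOllama } from "@langchain/ollama";

const redis = new Redis({
  url: process.env.UPSTASH_REDIS_URL!,
  token: process.env.UPSTASH_REDIS_TOKEN!,
});
const model = new ChatOllama({ model: process.env.OLLAMA_MODEL ?? "gemma3:4b" });

type Turn = ["human" | "ai", string];

async function chat(sessionId: string, userInput: string): Promise<string> {
  const key = `chat:${sessionId}`; // assumption: one Redis list per session
  // The Upstash client JSON-(de)serializes list entries automatically.
  const history = await redis.lrange<Turn>(key, 0, -1);
  const reply = await model.invoke([...history, ["human", userInput]]);
  const answer = reply.content as string;
  await redis.rpush<Turn>(key, ["human", userInput], ["ai", answer]);
  return answer;
}

console.log(await chat("demo-session", "Remember that my name is Sam."));
```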
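For the rate limiter, here is a sketch of a Next.js `middleware.ts` built on `@upstash/ratelimit`; the 10-requests-per-10-seconds window is illustrative, not the project's actual limit.

```ts
// middleware.ts — rate-limiting sketch: a sliding window of 10 requests
// per 10 seconds per caller IP, applied to the API routes only.
import { Ratelimit } from "@upstash/ratelimit";
import { Redis } from "@upstash/redis";
import { NextRequest, NextResponse } from "next/server";

const ratelimit = new Ratelimit({
  redis: new Redis({
    url: process.env.UPSTASH_REDIS_URL!,
    token: process.env.UPSTASH_REDIS_TOKEN!,
  }),
  limiter: Ratelimit.slidingWindow(10, "10 s"), // illustrative limits
});

export async function middleware(req: NextRequest) {
  // Identify callers by IP, falling back to a shared bucket.
  const ip = req.headers.get("x-forwarded-for") ?? "anonymous";
  const { success } = await ratelimit.limit(ip);
  if (!success) {
    return new NextResponse("Too many requests", { status: 429 });
  }
  return NextResponse.next();
}

// Only guard the API routes.
export const config = { matcher: "/api/:path*" };
```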
```env
PERPLEXITY_API_KEY=""
NEXT_PUBLIC_HOST="http://localhost:3000"
EMBEDDING_MODEL="dengcao/Qwen3-Embedding-0.6B:Q8_0" # Ollama
OLLAMA_BASE_URL="http://localhost:11434"
OLLAMA_MODEL="gemma3:4b"
PINECONE_KEY=""
UPSTASH_REDIS_URL="" # for rate-limiter
UPSTASH_REDIS_TOKEN="" # for rate-limiter
```

We appreciate contributions! Please see CONTRIBUTING.md for more information.
- Next.js — for an amazing framework.
- LangChain.js — for simplifying complex LLM integrations and RAG workflows.
- Ollama — for letting me run LLMs on my half-alive laptop!
- Perplexity API — for sparking the idea of building a RAG application.