A Model Context Protocol (MCP) server that creates a local database of indexed and optimized technical documentation. It enables AI agents to efficiently query, search, and retrieve documentation from both web sources and local files through MCP tools.
- Web Crawling: Automatically crawl and index documentation websites
- Local File Indexing: Process local markdown documentation
- AI-Powered Processing: Optional AI enhancement for metadata extraction and example generation
- Smart Search: Fuzzy search and semantic retrieval capabilities
- Efficient Storage: Folder-based markdown storage with frontmatter metadata
- MCP Integration: Full MCP protocol support for AI agent interaction
- Async Architecture: Fast, concurrent processing throughout
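
The fuzzy search feature listed above can be illustrated with the standard library alone. This is a minimal sketch, not the matcher LocalDocs actually uses: the function name `fuzzy_rank`, the `cutoff` value, and the sample titles are all assumptions made for illustration.

```python
from difflib import SequenceMatcher


def fuzzy_rank(pattern: str, titles: list[str], cutoff: float = 0.3) -> list[tuple[str, float]]:
    """Rank titles by similarity to the pattern, best match first.

    Hypothetical helper: LocalDocs may score matches differently.
    """
    scored = []
    for title in titles:
        # ratio() returns a similarity in [0, 1]; compare case-insensitively
        score = SequenceMatcher(None, pattern.lower(), title.lower()).ratio()
        if score >= cutoff:
            scored.append((title, score))
    return sorted(scored, key=lambda item: item[1], reverse=True)


# Sample titles are made up for the example
titles = ["Thinking in React", "Using the State Hook", "Passing Props to a Component"]
for title, score in fuzzy_rank("state hook", titles):
    print(f"{score:.2f}  {title}")
```

Titles that score below the cutoff are dropped, so a loose pattern returns a short ranked list rather than every document.
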
# Install from source
git clone https://github.com/dylan-gluck/localdocs-mcp
cd localdocs-mcp
uv sync
# Run directly with uvx (coming soon)
# uvx localdocs-mcp

# Crawl web documentation
localdocs init react --crawl https://react.dev/learn --depth 2
# Index local files
localdocs init myproject --local ~/Documents/myproject/docs
# With AI processing (requires OpenAI API key)
localdocs init vue --crawl https://vuejs.org/guide/ --ai

# Search across all collections
localdocs search "useState hook"
# Search specific collection
localdocs search "component props" --collection react
# List all collections
localdocs list

Add to your Claude Desktop configuration (~/Library/Application Support/Claude/claude_desktop_config.json). The OPENAI_API_KEY entry is optional and only needed for AI processing:

{
  "mcpServers": {
    "localdocs": {
      "command": "uvx",
      "args": ["localdocs-mcp", "serve"],
      "env": {
        "OPENAI_API_KEY": "${OPENAI_API_KEY}"
      }
    }
  }
}

The server exposes the following tools to AI agents:
| Tool | Description | Parameters |
|---|---|---|
| search_docs | Search across all documentation | query, collection?, limit? |
| list_collections | List available collections | - |
| get_document | Get specific document by ID | doc_id |
| list_examples | List code examples | collection?, language? |
| fuzzy_find | Fuzzy search documents | pattern, collection? |
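
To make the table concrete, here is a sketch of the JSON-RPC message an MCP client would send to invoke one of these tools. MCP tool invocations use the `tools/call` method with a tool `name` and an `arguments` object; the helper `build_tool_call` and the argument values are assumptions for illustration, and the argument types (for example, `limit` as an integer) are not stated in this README.

```python
import json


def build_tool_call(request_id: int, name: str, **arguments) -> str:
    """Build an MCP tools/call request as a JSON string.

    Hypothetical helper; a real MCP client library handles this for you.
    """
    # Optional parameters (marked with ? in the table) are omitted when None
    args = {key: value for key, value in arguments.items() if value is not None}
    message = {
        "jsonrpc": "2.0",
        "id": request_id,
        "method": "tools/call",
        "params": {"name": name, "arguments": args},
    }
    return json.dumps(message)


# Search the "react" collection; the values here are illustrative only
request = build_tool_call(1, "search_docs", query="useState hook", collection="react", limit=5)
print(request)
```

In practice an agent never builds this by hand, but seeing the payload clarifies how the Parameters column maps onto a call.
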
# Initialize new collection
localdocs init <name> --crawl <url> [--depth N] [--ai]
localdocs init <name> --local <path> [--ai]
# List collections
localdocs list
# Update existing collection
localdocs update <name>
# Delete collection
localdocs delete <name>

# Search documents
localdocs search <query> [--collection NAME] [--limit N]
# Show specific document
localdocs show <doc-id>
# Get statistics
localdocs stats [--collection NAME]

# Start MCP server (stdio transport)
localdocs serve
# Start with HTTP transport (coming soon)
localdocs serve --port 8080

LocalDocs stores configuration in ~/.localdocs-mcp/config.yaml:
storage_path: ~/.localdocs-mcp
default_collection: main
crawl_defaults:
  depth: 2
  word_count_threshold: 50
  excluded_tags: [nav, footer, header]
  cache_enabled: true
processing:
  chunk_size: 2000
  overlap: 200
  generate_examples: true
baml:
  model: gpt-4o-mini
  temperature: 0.3

# Install dependencies
uv sync
# Run tests
uv run pytest tests/
# Run specific test file
uv run pytest tests/test_storage.py -v
# Lint and format code
uvx ruff check .
uvx ruff format .
# Type checking
uv run mypy localdocs

LocalDocs follows a modular architecture:
- CLI Layer: Typer-based command interface
- Processing Layer: Web crawling (Crawl4AI) and document processing
- Storage Layer: File-based storage with markdown and frontmatter
- MCP Layer: FastMCP server implementation
- AI Layer: Optional BAML integration for enhanced processing
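
The Storage Layer's markdown-with-frontmatter files can be read with very little code. The sketch below is an illustration under stated assumptions, not the LocalDocs implementation: it handles only flat `key: value` frontmatter, and it uses `json.loads` as a stand-in for a real YAML parser (which works for quoted strings, numbers, booleans, and inline lists, and falls back to the raw string otherwise). The names `split_frontmatter` and `SAMPLE` are hypothetical.

```python
import json

# Sample document in the stored format (values are placeholders)
SAMPLE = """---
id: "uuid-here"
collection: "react"
source_url: "https://react.dev/learn/thinking-in-react"
chunk: 1
total_chunks: 3
tags: ["react", "component", "state"]
created: 2025-09-04
examples_generated: true
---
# Thinking in React (Part 1/3)
[Document content here]
"""


def split_frontmatter(text: str) -> tuple[dict, str]:
    """Split a stored document into (metadata, markdown body)."""
    lines = text.splitlines()
    if not lines or lines[0] != "---":
        return {}, text  # no frontmatter block
    try:
        end = lines.index("---", 1)  # closing delimiter
    except ValueError:
        return {}, text  # unterminated frontmatter, treat as plain markdown
    meta = {}
    for line in lines[1:end]:
        key, sep, raw = line.partition(":")  # split at the first colon only
        if not sep:
            continue
        raw = raw.strip()
        try:
            meta[key.strip()] = json.loads(raw)
        except json.JSONDecodeError:
            meta[key.strip()] = raw  # e.g. unquoted dates stay as strings
    return meta, "\n".join(lines[end + 1:])


meta, body = split_frontmatter(SAMPLE)
print(meta["collection"], meta["chunk"], meta["total_chunks"])
```

A production reader would use a proper YAML library so that nested keys and multi-line values are handled correctly.
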
Documents are stored as markdown files with YAML frontmatter:
---
id: "uuid-here"
collection: "react"
source_url: "https://react.dev/learn/thinking-in-react"
title: "Thinking in React"
chunk: 1
total_chunks: 3
tags: ["react", "component", "state"]
created: 2025-09-04
examples_generated: true
---
# Thinking in React (Part 1/3)
[Document content here]
## Generated Examples
[Example code blocks]

- OPENAI_API_KEY: Required for AI-powered processing features
- ANTHROPIC_API_KEY: Alternative AI provider for processing
- LOCALDOCS_PATH: Override default storage path
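
As a rough illustration of how the `LOCALDOCS_PATH` override might be resolved, the sketch below falls back to `~/.localdocs-mcp` when the variable is unset. The function name `resolve_storage_path` is hypothetical, and how LocalDocs actually orders the environment variable against `storage_path` in the config file is not stated here, so treat the precedence as an assumption.

```python
import os
from pathlib import Path

# Default taken from the storage_path shown in the configuration file
DEFAULT_STORAGE = "~/.localdocs-mcp"


def resolve_storage_path(env=None) -> Path:
    """Return the storage directory, honoring LOCALDOCS_PATH if set.

    Hypothetical helper; the env parameter exists so the lookup can be tested.
    """
    env = os.environ if env is None else env
    # An empty value is treated the same as an unset variable
    raw = env.get("LOCALDOCS_PATH") or DEFAULT_STORAGE
    return Path(raw).expanduser()


print(resolve_storage_path())
```
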
Contributions are welcome! Please feel free to submit a Pull Request.
MIT License - see LICENSE file for details
- Vector embeddings for semantic search
- Support for more file types (PDF, docx)
- HTTP transport option for MCP
- Incremental indexing
- Web UI for document browsing
- Custom BAML prompts
- Multi-language code detection improvements
Built with: