dylan-gluck/localdocs-mcp

LocalDocs MCP

A Model Context Protocol (MCP) server that creates a local database of indexed and optimized technical documentation. It enables AI agents to efficiently query, search, and retrieve documentation from both web sources and local files through MCP tools.

Features

  • Web Crawling: Automatically crawl and index documentation websites
  • Local File Indexing: Process local markdown documentation
  • AI-Powered Processing: Optional AI enhancement for metadata extraction and example generation
  • Smart Search: Fuzzy search and semantic retrieval capabilities
  • Efficient Storage: Folder-based markdown storage with frontmatter metadata
  • MCP Integration: Full MCP protocol support for AI agent interaction
  • Async Architecture: Fast, concurrent processing throughout

Installation

# Install from source
git clone https://github.com/dylan-gluck/localdocs-mcp
cd localdocs-mcp
uv sync

# Run directly with uvx (coming soon)
# uvx localdocs-mcp

Quick Start

1. Initialize a Documentation Collection

# Crawl web documentation
localdocs init react --crawl https://react.dev/learn --depth 2

# Index local files
localdocs init myproject --local ~/Documents/myproject/docs

# With AI processing (requires OpenAI API key)
localdocs init vue --crawl https://vuejs.org/guide/ --ai

2. Search Documentation

# Search across all collections
localdocs search "useState hook"

# Search specific collection
localdocs search "component props" --collection react

# List all collections
localdocs list
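As a rough illustration of what fuzzy matching over document titles can look like, the sketch below uses `difflib` from the Python standard library. It is a stand-in for the idea only, not the matcher LocalDocs actually uses; the function name `fuzzy_find_titles`, the sample titles, and the `cutoff` value are assumptions made for this example.

```python
import difflib

# Hypothetical document titles; a real collection would supply its own.
TITLES = ["Thinking in React", "Managing State", "Escape Hatches"]


def fuzzy_find_titles(pattern: str, titles: list[str],
                      limit: int = 3, cutoff: float = 0.6) -> list[str]:
    """Return up to `limit` titles that approximately match `pattern`.

    Matching is case-insensitive and tolerant of small typos.
    """
    # Map lowercased titles back to their original spelling.
    lowered = {title.lower(): title for title in titles}
    hits = difflib.get_close_matches(
        pattern.lower(), list(lowered), n=limit, cutoff=cutoff
    )
    return [lowered[hit] for hit in hits]


# The misspelled query still finds the intended title.
print(fuzzy_find_titles("thinkng in react", TITLES))
```

Raising `cutoff` makes matching stricter; lowering it returns more, but looser, candidates.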

3. Configure MCP Client

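Add the server to your Claude Desktop configuration (~/Library/Application Support/Claude/claude_desktop_config.json on macOS). The env block is optional and only needed for AI processing. The uvx entry point is listed as coming soon under Installation, so this configuration assumes it is available:

{
  "mcpServers": {
    "localdocs": {
      "command": "uvx",
      "args": ["localdocs-mcp", "serve"],
      "env": {
        "OPENAI_API_KEY": "${OPENAI_API_KEY}"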
      }
    }
  }
}
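If the configuration file already lists other MCP servers, the new entry has to be merged in rather than pasted over them. The following is a minimal sketch of that merge using only the standard library. The helper names `add_localdocs_server` and `update_config_file` are hypothetical, and the server entry is the one from the configuration in this section without the optional env block.

```python
import json
from pathlib import Path

# macOS location of the Claude Desktop config, as given in this section.
CONFIG_PATH = (
    Path.home() / "Library/Application Support/Claude/claude_desktop_config.json"
)

# Server entry from this section, without the optional "env" block.
LOCALDOCS_ENTRY = {"command": "uvx", "args": ["localdocs-mcp", "serve"]}


def add_localdocs_server(config: dict) -> dict:
    """Return a copy of `config` with the localdocs server registered.

    Existing entries under "mcpServers" are preserved.
    """
    merged = dict(config)
    servers = dict(merged.get("mcpServers", {}))
    servers["localdocs"] = LOCALDOCS_ENTRY
    merged["mcpServers"] = servers
    return merged


def update_config_file(path: Path = CONFIG_PATH) -> None:
    """Read the config file if it exists, add the entry, and write it back."""
    existing = json.loads(path.read_text()) if path.exists() else {}
    path.parent.mkdir(parents=True, exist_ok=True)
    path.write_text(json.dumps(add_localdocs_server(existing), indent=2) + "\n")


# In-memory demonstration; call update_config_file() to touch the real file.
print(json.dumps(add_localdocs_server({"mcpServers": {}}), indent=2))
```

Writing the file through `json.dumps` also guarantees the result is strict JSON, which does not allow comments.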

MCP Tools

The server exposes the following tools to AI agents:

Tool              Description                      Parameters
search_docs       Search across all documentation  query, collection?, limit?
list_collections  List available collections       -
get_document      Get specific document by ID      doc_id
list_examples     List code examples               collection?, language?
fuzzy_find        Fuzzy search documents           pattern, collection?
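MCP clients invoke tools with a JSON-RPC 2.0 `tools/call` request. The sketch below builds such a request for `search_docs`, to show how the parameters listed above would travel on the wire. It assumes that a trailing `?` marks an optional parameter, that `query` and `collection` are strings, and that `limit` is an integer; the helper name `build_search_request` is hypothetical.

```python
import json
from typing import Optional


def build_search_request(query: str, collection: Optional[str] = None,
                         limit: Optional[int] = None,
                         request_id: int = 1) -> str:
    """Serialize a JSON-RPC `tools/call` request for the search_docs tool."""
    arguments: dict = {"query": query}
    # Optional parameters are omitted entirely when not provided.
    if collection is not None:
        arguments["collection"] = collection
    if limit is not None:
        arguments["limit"] = limit
    request = {
        "jsonrpc": "2.0",
        "id": request_id,
        "method": "tools/call",
        "params": {"name": "search_docs", "arguments": arguments},
    }
    return json.dumps(request)


print(build_search_request("useState hook", collection="react", limit=5))
```

In practice an MCP client library handles this framing; the sketch only makes the message shape visible.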

CLI Commands

Collection Management

# Initialize new collection
localdocs init <name> --crawl <url> [--depth N] [--ai]
localdocs init <name> --local <path> [--ai]

# List collections
localdocs list

# Update existing collection
localdocs update <name>

# Delete collection
localdocs delete <name>

Document Operations

# Search documents
localdocs search <query> [--collection NAME] [--limit N]

# Show specific document
localdocs show <doc-id>

# Get statistics
localdocs stats [--collection NAME]

MCP Server

# Start MCP server (stdio transport)
localdocs serve

# Start with HTTP transport (coming soon)
# localdocs serve --port 8080

Configuration

LocalDocs stores configuration in ~/.localdocs-mcp/config.yaml:

storage_path: ~/.localdocs-mcp
default_collection: main

crawl_defaults:
  depth: 2
  word_count_threshold: 50
  excluded_tags: [nav, footer, header]
  cache_enabled: true

processing:
  chunk_size: 2000
  overlap: 200
  generate_examples: true
  
baml:
  model: gpt-4o-mini
  temperature: 0.3
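To show how `chunk_size` and `overlap` interact, here is a minimal sketch of overlapping chunking. It assumes both values are measured in characters and that chunks are cut at fixed offsets; the README does not say which unit or splitting strategy LocalDocs uses, so treat this as an illustration of the two settings rather than the project's algorithm. The function name `chunk_text` is hypothetical.

```python
def chunk_text(text: str, chunk_size: int = 2000, overlap: int = 200) -> list[str]:
    """Split `text` into chunks of at most `chunk_size` characters.

    Each chunk after the first starts `overlap` characters before the
    end of the previous one, so neighbouring chunks share context.
    """
    if chunk_size <= 0 or not 0 <= overlap < chunk_size:
        raise ValueError("need chunk_size > 0 and 0 <= overlap < chunk_size")
    step = chunk_size - overlap
    chunks = []
    start = 0
    while start < len(text):
        chunks.append(text[start:start + chunk_size])
        # Stop once the current chunk reaches the end of the text.
        if start + chunk_size >= len(text):
            break
        start += step
    return chunks


sample = "".join(str(i % 10) for i in range(5000))
print([len(chunk) for chunk in chunk_text(sample)])
```

With the defaults, a new chunk starts every 1800 characters, so a larger `overlap` means more chunks and more duplicated text per document.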

Development

# Install dependencies
uv sync

# Run tests
uv run pytest tests/

# Run specific test file
uv run pytest tests/test_storage.py -v

# Lint and format code
uvx ruff check .
uvx ruff format .

# Type checking
uv run mypy localdocs

Architecture

LocalDocs follows a modular architecture:

  • CLI Layer: Typer-based command interface
  • Processing Layer: Web crawling (Crawl4AI) and document processing
  • Storage Layer: File-based storage with markdown and frontmatter
  • MCP Layer: FastMCP server implementation
  • AI Layer: Optional BAML integration for enhanced processing

Storage Format

Documents are stored as markdown files with YAML frontmatter:

---
id: "uuid-here"
collection: "react"
source_url: "https://react.dev/learn/thinking-in-react"
title: "Thinking in React"
chunk: 1
total_chunks: 3
tags: ["react", "component", "state"]
created: 2025-09-04
examples_generated: true
---

# Thinking in React (Part 1/3)

[Document content here]

## Generated Examples

[Example code blocks]
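Because each document is a plain file, other tools can read it directly. The sketch below splits a stored document into metadata and body using only the standard library. It handles just the flat, one-line `key: value` form shown above and decodes values with `json.loads` where that works; a full YAML parser would be more robust. The function name `parse_document` is hypothetical.

```python
import json

SAMPLE = """---
id: "uuid-here"
collection: "react"
source_url: "https://react.dev/learn/thinking-in-react"
chunk: 1
tags: ["react", "component", "state"]
created: 2025-09-04
examples_generated: true
---

# Thinking in React (Part 1/3)

[Document content here]
"""


def parse_document(text: str) -> tuple[dict, str]:
    """Split a stored document into (metadata, body)."""
    lines = text.splitlines()
    if not lines or lines[0].strip() != "---":
        return {}, text
    # Find the closing frontmatter delimiter.
    end = next((i for i in range(1, len(lines)) if lines[i].strip() == "---"), None)
    if end is None:
        return {}, text
    metadata = {}
    for line in lines[1:end]:
        key, sep, raw = line.partition(":")
        if not sep:
            continue
        raw = raw.strip()
        try:
            # Quoted strings, numbers, booleans, and inline lists parse as JSON.
            metadata[key.strip()] = json.loads(raw)
        except json.JSONDecodeError:
            # Anything else (such as a bare date) is kept as a string.
            metadata[key.strip()] = raw
    body = "\n".join(lines[end + 1:]).strip()
    return metadata, body


meta, body = parse_document(SAMPLE)
print(meta["collection"], meta["chunk"], meta["tags"])
```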

Environment Variables

  • OPENAI_API_KEY: Required for AI-powered processing features when using an OpenAI model (the default configuration uses gpt-4o-mini)
  • ANTHROPIC_API_KEY: API key for Anthropic, an alternative AI provider for processing
  • LOCALDOCS_PATH: Override default storage path
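The override behaviour of `LOCALDOCS_PATH` can be sketched as follows. This assumes the variable simply replaces the default `~/.localdocs-mcp` location when it is set and non-empty; how it interacts with `storage_path` in config.yaml is not stated in this README, so that case is left out. The function name `resolve_storage_path` is hypothetical.

```python
import os
from pathlib import Path
from typing import Mapping, Optional

# Default storage location, as shown in the Configuration section.
DEFAULT_STORAGE = "~/.localdocs-mcp"


def resolve_storage_path(env: Optional[Mapping[str, str]] = None) -> Path:
    """Return the storage directory, honouring LOCALDOCS_PATH if set."""
    env = os.environ if env is None else env
    # An unset or empty variable falls back to the default location.
    raw = env.get("LOCALDOCS_PATH") or DEFAULT_STORAGE
    return Path(raw).expanduser()


print(resolve_storage_path({"LOCALDOCS_PATH": "/tmp/docs"}))
print(resolve_storage_path({}))
```

Passing the environment as a parameter keeps the function easy to test without modifying the real process environment.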

Contributing

Contributions are welcome! Please feel free to submit a Pull Request.

License

MIT License - see LICENSE file for details

Roadmap

  • Vector embeddings for semantic search
  • Support for more file types (PDF, docx)
  • HTTP transport option for MCP
  • Incremental indexing
  • Web UI for document browsing
  • Custom BAML prompts
  • Multi-language code detection improvements

Acknowledgments

Built with the libraries named under Architecture: Typer, Crawl4AI, FastMCP, and BAML.
