Repository Intelligence Tools - Analyze codebases with LLM assistance
Repo Intel is a comprehensive toolkit for analyzing repositories, generating documentation, and performing AI-assisted code reviews. It provides three main tools:
- Git Diff Analyzer - Compare branches with LLM-powered code reviews
- Markdown Bundler - Combine code files into LLM-friendly markdown bundles
- Glue Documenter - Generate documentation for AWS Glue databases
Install with pip:

```bash
pip install repo-intel
```

Quick start:

```bash
# Analyze differences between branches
repo-intel diff-analyze main staging
# Create a markdown bundle of your codebase
repo-intel markdown-bundle src/ -o codebase.md
# Document AWS Glue databases
repo-intel glue-document -d my_database
# List available LLM providers
repo-intel list-providers
```

The Git Diff Analyzer provides:

- File-by-file analysis that breaks down large diffs into manageable chunks (see the sketch after this list)
- LLM integration with OpenAI, Anthropic, and local models
- Risk assessment with automatic priority ranking
- Smart filtering to skip oversized or binary files
- Comprehensive reporting with both summary and detailed analysis
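The file-by-file approach is easiest to picture as a sketch: list the files that changed between two branches, then pull each file's diff separately so every LLM prompt stays small. This is an illustration of the idea rather than the actual code in diff.py; the git commands are standard, and the size cutoff mirrors the LLM_DEFAULT_MAX_FILE_SIZE default.

```python
import subprocess

MAX_FILE_SIZE = 250_000  # mirrors the LLM_DEFAULT_MAX_FILE_SIZE default


def per_file_diffs(base: str, target: str, max_size: int = MAX_FILE_SIZE):
    """Yield (path, diff_text) for each file changed between two branches."""
    # Names of files that differ between the branches.
    files = subprocess.run(
        ["git", "diff", "--name-only", f"{base}...{target}"],
        capture_output=True, text=True, check=True,
    ).stdout.splitlines()

    for path in files:
        # One diff per file keeps each chunk small enough to review in isolation.
        diff = subprocess.run(
            ["git", "diff", f"{base}...{target}", "--", path],
            capture_output=True, text=True, check=True,
        ).stdout
        if len(diff.encode()) > max_size:
            continue  # oversized diffs are skipped, as the analyzer does
        yield path, diff


for path, _ in per_file_diffs("main", "staging"):
    print(path)
```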
The Markdown Bundler provides:

- Flexible file inclusion with configurable extensions
- Smart exclusion patterns for common directories (node_modules, .git, etc.; see the sketch after this list)
- Organized output with table of contents and proper formatting
- Markdown-only mode for documentation bundling
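Exclusion of this kind usually comes down to matching path components against a pattern list. The helper below is a minimal fnmatch-based sketch with assumed default patterns; the actual logic and defaults in markdown_bundle.py may differ.

```python
from fnmatch import fnmatch
from pathlib import Path

# Assumed defaults for illustration; the CLI lets you add more via --exclude.
EXCLUDE_PATTERNS = ["node_modules", ".git", "__pycache__", ".pytest_cache", "*.pyc"]


def is_excluded(path: Path, patterns=EXCLUDE_PATTERNS) -> bool:
    """True if any component of the path matches an exclusion pattern."""
    return any(fnmatch(part, pattern) for part in path.parts for pattern in patterns)


included = [p for p in Path("src").rglob("*.py") if not is_excluded(p)]
print(f"{len(included)} files would be bundled")
```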
The Glue Documenter provides (a boto3 sketch follows this list):

- Complete AWS Glue documentation for databases and tables
- Schema documentation with column details and types
- Metadata extraction including creation dates and parameters
- Flexible filtering to exclude specific databases or tables
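Under the hood this is the AWS Glue Data Catalog API. The sketch below shows the underlying boto3 calls for pulling table schemas, assuming credentials come from your environment or profile; it is not the code in glue_bundle.py.

```python
import boto3

glue = boto3.client("glue")  # region/credentials come from your AWS environment


def table_schemas(database: str):
    """Yield (table_name, [(column, type), ...]) for every table in a Glue database."""
    paginator = glue.get_paginator("get_tables")
    for page in paginator.paginate(DatabaseName=database):
        for table in page["TableList"]:
            columns = [
                (col["Name"], col.get("Type", ""))
                for col in table.get("StorageDescriptor", {}).get("Columns", [])
            ]
            yield table["Name"], columns


for name, columns in table_schemas("my_database"):
    print(name, columns)
```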
Repo Intel uses environment variables for configuration. Create a .env file or set these in your environment:
```bash
# Choose your LLM provider
LLM_PROVIDER=openai # 'openai', 'anthropic', 'local', or leave empty for auto-select
# OpenAI settings
OPENAI_API_KEY=your_openai_api_key
OPENAI_MODEL=gpt-4
OPENAI_TEMPERATURE=0.3
# Anthropic settings
ANTHROPIC_API_KEY=your_anthropic_api_key
ANTHROPIC_MODEL=claude-3-sonnet-20240229
# Local LLM settings (Ollama, etc.)
LOCAL_LLM_BASE_URL=http://localhost:11434
LOCAL_LLM_MODEL=codellama

# AWS settings (used by the Glue Documenter)
AWS_REGION=us-west-2
AWS_PROFILE=default

# Default output directory
OUTPUT_DEFAULT_DIR=repo_intel_output
# Maximum file size for LLM analysis (bytes)
LLM_DEFAULT_MAX_FILE_SIZE=250000
# Enable verbose logging
OUTPUT_VERBOSE=true
```
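These are ordinary environment variables, so it is easy to check what the tool will see. The snippet below is a minimal sketch of reading them with os.environ (optionally after python-dotenv loads your .env file); the variable names come from this README, but the loading code is illustrative rather than settings.py itself.

```python
import os

# Optional: load a .env file first if python-dotenv is installed.
# from dotenv import load_dotenv; load_dotenv()

provider = os.environ.get("LLM_PROVIDER", "")  # empty string means auto-select
max_file_size = int(os.environ.get("LLM_DEFAULT_MAX_FILE_SIZE", "250000"))
output_dir = os.environ.get("OUTPUT_DEFAULT_DIR", "repo_intel_output")
verbose = os.environ.get("OUTPUT_VERBOSE", "false").lower() == "true"

print(provider or "auto-select", max_file_size, output_dir, verbose)
```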
Git Diff Analyzer usage:

```bash
# Basic branch comparison
repo-intel diff-analyze main feature/new-api
# Custom output directory and file size limit
repo-intel diff-analyze main staging \
--output-dir my_review \
--max-file-size 500000
# Skip LLM analysis for faster processing
repo-intel diff-analyze main staging --no-llm
# Force specific LLM provider
repo-intel diff-analyze main staging --llm-provider anthropic
```

Markdown Bundler usage:

```bash
# Bundle all code files
repo-intel markdown-bundle src/
# Bundle only markdown files
repo-intel markdown-bundle docs/ --markdown-only
# Exclude additional patterns
repo-intel markdown-bundle . --exclude __pycache__ .pytest_cache
# Custom output file
repo-intel markdown-bundle src/ -o my_codebase.md
```
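The bundle itself is just one markdown file: a table of contents followed by a fenced section per source file. The sketch below reproduces that shape under simple assumptions about ordering and headings; the actual output of markdown_bundle.py may be formatted differently.

```python
from pathlib import Path

FENCE = "`" * 3  # keeps literal fences out of this example


def bundle(root: str, out: str = "codebase.md", extensions=(".py", ".md")) -> None:
    """Write one markdown file: a table of contents plus a fenced section per file."""
    files = sorted(p for p in Path(root).rglob("*") if p.is_file() and p.suffix in extensions)
    with open(out, "w", encoding="utf-8") as fh:
        fh.write("# Code Bundle\n\n## Table of Contents\n\n")
        fh.writelines(f"- {p}\n" for p in files)
        for p in files:
            lang = p.suffix.lstrip(".")
            fh.write(f"\n## {p}\n\n{FENCE}{lang}\n")
            fh.write(p.read_text(encoding="utf-8", errors="replace"))
            fh.write(f"\n{FENCE}\n")


bundle("src/")
```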
Glue Documenter usage:

```bash
# Document all databases
repo-intel glue-document
# Document specific database
repo-intel glue-document -d my_database
# Use specific AWS profile and region
repo-intel glue-document --profile prod --region us-east-1
# Exclude specific databases or tables
repo-intel glue-document \
--exclude-databases temp_db test_db \
    --exclude-tables temp_table
```

Project layout:

```text
repo-intel/
├── src/repo_intel/
│   ├── __init__.py          # Package version and metadata
│   ├── cli.py               # Main CLI interface
│   ├── diff.py              # Git diff analyzer
│   ├── llm.py               # LLM provider integrations
│   ├── markdown_bundle.py   # Markdown bundler
│   ├── glue_bundle.py       # AWS Glue documenter
│   └── settings.py          # Configuration management
├── tests/                   # Test suite
├── docs/                    # Documentation
├── setup.py                 # Package setup
├── pyproject.toml           # Modern Python packaging
├── requirements.txt         # Dependencies
└── README.md                # This file
```
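cli.py is the entry point that routes the subcommands used throughout this README. As a rough sketch, a dispatcher like that can be wired with argparse as below; the option names beyond those shown earlier are assumptions, and the real cli.py may use a different framework.

```python
import argparse


def main() -> None:
    parser = argparse.ArgumentParser(prog="repo-intel")
    sub = parser.add_subparsers(dest="command", required=True)

    diff = sub.add_parser("diff-analyze", help="Compare branches with an LLM-powered review")
    diff.add_argument("base")
    diff.add_argument("target")
    diff.add_argument("--no-llm", action="store_true")

    bundle = sub.add_parser("markdown-bundle", help="Bundle code files into markdown")
    bundle.add_argument("path")
    bundle.add_argument("-o", "--output")

    sub.add_parser("glue-document", help="Document AWS Glue databases")
    sub.add_parser("list-providers", help="List available LLM providers")

    args = parser.parse_args()
    print(f"would dispatch to: {args.command}")  # the real CLI calls into the matching module


if __name__ == "__main__":
    main()
```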
To use LLM-assisted analysis, set up one of the supported providers.

OpenAI:

- Get an API key from OpenAI
- Set the `OPENAI_API_KEY` environment variable
- Choose your model (gpt-4, gpt-3.5-turbo, etc.)
Anthropic:

- Get an API key from Anthropic
- Set the `ANTHROPIC_API_KEY` environment variable
- Use Claude models for analysis
Local models (Ollama):

- Install Ollama
- Pull a code model: `ollama pull codellama`
- Start the service: `ollama serve`
- Configure `LOCAL_LLM_BASE_URL` and `LOCAL_LLM_MODEL` (see the sketch below)
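To confirm the local endpoint responds before running an analysis, you can call Ollama's HTTP API directly. This sketch uses Ollama's standard /api/generate endpoint with the values configured above; how repo-intel's llm.py actually formats its requests may differ.

```python
import json
import urllib.request

base_url = "http://localhost:11434"  # LOCAL_LLM_BASE_URL
model = "codellama"                  # LOCAL_LLM_MODEL

payload = json.dumps({
    "model": model,
    "prompt": "Reply 'ready' if you can review code.",
    "stream": False,
}).encode()

request = urllib.request.Request(
    f"{base_url}/api/generate",
    data=payload,
    headers={"Content-Type": "application/json"},
)
with urllib.request.urlopen(request) as response:
    print(json.loads(response.read())["response"])
```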
Diff analysis reports are written to the output directory:

```text
repo_intel_output/
├── README.md                  # Summary with risk-ranked files
├── summary.json               # Machine-readable summary
├── detailed_analysis.json     # Complete analysis data
└── files/                     # Individual file reports
    ├── src_main.py.md
    ├── api_routes.py.md
    └── config_settings.py.md
```
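summary.json is the machine-readable entry point if you want to post-process results, for example to fail a CI job on high-risk changes. The schema is not documented here, so the field names in this sketch (files, path, risk_score) are assumptions; check your own summary.json for the real keys.

```python
import json
from pathlib import Path

summary = json.loads(Path("repo_intel_output/summary.json").read_text())

# Field names below are hypothetical; inspect your summary.json for the actual schema.
risky = [f for f in summary.get("files", []) if f.get("risk_score", 0) >= 8]
for entry in sorted(risky, key=lambda f: f["risk_score"], reverse=True):
    print(f"{entry['risk_score']:>2}  {entry['path']}")
```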
Files are automatically prioritized by risk score (a minimal mapping sketch follows the list):
- Critical (8-10): Core changes requiring immediate attention
- High (6-7): Important changes needing careful review
- Medium (4-5): Standard changes requiring normal review
- Low (1-3): Minor changes needing quick review
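Those bands translate directly into a small lookup you can reuse when post-processing reports; the ranges below mirror the list above, while the function itself is just an illustration.

```python
def priority(score: int) -> str:
    """Map a 1-10 risk score to the priority bands used in the reports."""
    if score >= 8:
        return "Critical"
    if score >= 6:
        return "High"
    if score >= 4:
        return "Medium"
    return "Low"


assert priority(9) == "Critical" and priority(3) == "Low"
```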
Contributions are welcome! Please feel free to submit a Pull Request.
This project is licensed under the MIT License - see the LICENSE file for details.
Support:

- Issues: GitHub Issues
- Documentation: GitHub README
- Discussions: GitHub Discussions
Roadmap:

- Support for more LLM providers (HuggingFace, local models)
- Integration with popular code review tools
- Advanced filtering and configuration options
- Web UI for report viewing
- CI/CD integration examples
- Plugin system for custom analyzers
Made with ❤️ for developers who love clean, well-analyzed code