Fast, concurrent URL validation for your repositories 🚀
A GitHub Action that checks whether the URLs in your files are reachable, using the urlsup Rust binary. It catches broken links in documentation by verifying that every URL returns a successful HTTP status code.
There are plenty of examples in the examples/ directory, including real-world configurations and pre-built templates that you can use to set up URL validation in your workflows quickly.
- ✨ Features
- 🚀 Quick Start
- 📋 Inputs
- 📤 Outputs
- 📖 Usage Examples
- ⚠️ Common Use Cases
- ❓ FAQ
- 📚 Documentation & Examples
- 🔧 Migration from v1
- 🎯 GitHub Integration
- 🔧 Internal Scripts
- 🧪 Testing & Development
- 🔗 Related
- 📄 License
- ⚡ Lightning Fast: Composite action with binary caching (5-10x faster than Docker)
- 🔄 Concurrent: Multi-threaded URL checking with configurable concurrency
- 🎯 Smart Filtering: Allowlists, status code filtering, and regex exclusions
- 📊 Rich Reports: GitHub annotations, job summaries, and detailed JSON reports
- 🔧 Highly Configurable: 20+ inputs mapping to all urlsup features

    name: Validate URLs are up
    on:
      push:
      pull_request:
      schedule:
        - cron: '0 9 * * 1' # Weekly on Monday
    jobs:
      url-validate:
        runs-on: ubuntu-latest
        steps:
          - uses: actions/checkout@v4
          - name: Validate URLs
            id: validate-urls
            uses: simeg/urlsup-action@v2
            with:
              files: '.'
              recursive: true
              timeout-seconds: 5
              retry: 2

💡 Performance Note: The action automatically caches the urlsup binary, so runs after the first one are 5-10x faster.

| Input | Description | Default |
|---|---|---|
| files | Files or directories to check (space-separated) | '.' |
| recursive | Recursively process directories | true |
| include-extensions | File extensions to process (comma-separated) | 'md,rst,txt,html' |

| Input | Description | Default |
|---|---|---|
| timeout-seconds | Connection timeout in seconds | 5 |
| concurrency | Number of concurrent requests | # CPU cores |
| retry | Retry attempts for failed requests | 2 |
| retry-delay-ms | Delay between retries in milliseconds | 1000 |
| rate-limit-ms | Delay between requests in milliseconds | 100 |

| Input | Description | Default |
|---|---|---|
| allowlist | URLs to allow (comma-separated patterns) | |
| allow-status | HTTP status codes to allow (comma-separated) | '200,202,204' |
| exclude-pattern | URL patterns to exclude (regex) | |
| allow-timeout | Allow URLs that timeout | false |
| failure-threshold | Fail only if more than X% of URLs are broken (0-100). Leave empty to fail on any broken URL (default). | '' |

| Input | Description | Default |
|---|---|---|
| quiet | Suppress progress output | false |
| verbose | Enable verbose logging | false |

| Input | Description | Default |
|---|---|---|
| user-agent | Custom User-Agent header | 'urlsup-action/{urlsup-version}' |
| proxy | HTTP/HTTPS proxy URL | |
| insecure | Skip SSL certificate verification | false |

| Input | Description | Default |
|---|---|---|
| urlsup-version | Version of urlsup to use | 'latest' |
| create-annotations | Create GitHub annotations for broken URLs | true |
| fail-on-error | Fail the action if broken URLs are found | true |
| show-performance | Show performance metrics in job summaries | false |
| telemetry | Enable anonymous performance telemetry and metrics in job summaries | true |

| Output | Description |
|---|---|
| total-urls | Total number of URLs checked |
| broken-urls | Number of broken URLs found |
| success-rate | Percentage of working URLs |
| report-path | Path to detailed JSON report |
| exit-code | Exit code from urlsup (0 = success) |

    - name: Check all markdown files
      id: validate-urls
      uses: simeg/urlsup-action@v2
      with:
        files: '**/*.md'
        include-extensions: 'md'

    - name: Check URLs with custom settings
      id: validate-urls
      uses: simeg/urlsup-action@v2
      with:
        files: 'docs/ README.md CHANGELOG.md'
        timeout-seconds: 15
        retry: 3
        concurrency: 20
        allow-status: '200,202,204,404'
        exclude-pattern: 'localhost|127\.0\.0\.1|example\.com'
        allowlist: 'github.com,docs.github.com'
        user-agent: 'MyBot/1.0'

    - name: Check URLs (non-blocking)
      id: validate-urls
      uses: simeg/urlsup-action@v2
      with:
        files: 'docs/'
        fail-on-error: false
        create-annotations: true

    name: URL Validation
    on:
      push:
        branches: [main]
      pull_request:
      schedule:
        - cron: '0 9 * * 1' # Weekly
      workflow_dispatch:
        inputs:
          files:
            description: 'Files to check'
            default: '**/*.md'
          strict:
            description: 'Strict mode (fail on any broken URL)'
            type: boolean
            default: true
    permissions:
      contents: read
      pull-requests: write # Needed by the "Comment on PR" step below
    jobs:
      validate-urls:
        runs-on: ubuntu-latest
        steps:
          - name: Checkout
            uses: actions/checkout@v4
          - name: Validate URLs
            id: validate-urls
            uses: simeg/urlsup-action@v2
            with:
              files: ${{ inputs.files || '**/*.md' }}
              timeout-seconds: 5
              retry: 2
              rate-limit-ms: 100
              allow-status: ${{ inputs.strict && '200' || '200,202,204' }}
              # Empty in strict mode (any broken URL fails), 3% tolerance otherwise.
              # Note: `inputs.strict && '' || '3'` would always yield '3' because '' is falsy.
              failure-threshold: ${{ !inputs.strict && '3' || '' }}
              show-performance: true # Show detailed metrics
              fail-on-error: true # `inputs.strict || true` always evaluates to true
          - name: Comment on PR
            if: github.event_name == 'pull_request' && failure()
            uses: actions/github-script@v7
            with:
              script: |
                // Look for an earlier comment from this workflow so it can be updated in place.
                const { data: comments } = await github.rest.issues.listComments({
                  owner: context.repo.owner,
                  repo: context.repo.repo,
                  issue_number: context.issue.number,
                });
                const botComment = comments.find(comment =>
                  comment.user.type === 'Bot' && comment.body.includes('URL validation failed')
                );
                const body = '🔗 **URL validation failed** - Some links in your changes are broken. Please check the workflow run for details.';
                if (botComment) {
                  await github.rest.issues.updateComment({
                    owner: context.repo.owner,
                    repo: context.repo.repo,
                    comment_id: botComment.id,
                    body
                  });
                } else {
                  await github.rest.issues.createComment({
                    owner: context.repo.owner,
                    repo: context.repo.repo,
                    issue_number: context.issue.number,
                    body
                  });
                }

    - name: Validate documentation URLs
      id: validate-urls
      uses: simeg/urlsup-action@v2
      with:
        files: 'docs/ *.md'
        include-extensions: 'md,rst'
        allow-status: '200,202'
        exclude-pattern: 'localhost|127\.0\.0\.1'

    - name: Validate API documentation URLs
      id: validate-urls
      uses: simeg/urlsup-action@v2
      with:
        files: 'api-docs/'
        timeout-seconds: 60
        retry: 3
        allowlist: 'api.example.com,docs.example.com'

    # Traditional lenient approach (never fails)
    - name: Non-blocking URL validation
      id: validate-urls
      uses: simeg/urlsup-action@v2
      with:
        allow-status: '200,202,204,301,302,429'
        allow-timeout: true
        fail-on-error: false

    # Modern threshold approach (fails only if too many URLs are broken)
    - name: URL validation with 10% tolerance
      id: validate-urls
      uses: simeg/urlsup-action@v2
      with:
        failure-threshold: "10" # Allow up to 10% broken URLs
        allow-status: '200,202,204,301,302,429'
        show-performance: true # Track performance metrics
        retry: 2

    # Allow some broken URLs with detailed performance tracking
    - name: Validate URLs with tolerance and metrics
      id: validate-urls
      uses: simeg/urlsup-action@v2
      with:
        files: 'docs/ README.md'
        failure-threshold: "5" # Allow up to 5% broken URLs
        show-performance: true # Show detailed performance metrics
        timeout-seconds: 10
        retry: 3
        rate-limit-ms: 1500 # Be gentle with external sites
        allow-status: '200,202,204,429' # Include rate-limited responses

- Examples Directory - Real-world workflow examples and configurations
- Configuration Guide - Pre-built configuration templates
- Changelog - Version history and migration guides
- Testing Guide - Testing infrastructure and contribution guidelines
A: After the first run, the action is 5-10x faster thanks to automatic binary caching. The first run may take 1-2 minutes while urlsup is compiled; subsequent runs take 10-20 seconds.
A: No! The action automatically handles binary caching for you. No additional configuration needed.
A: By default: Markdown (.md), reStructuredText (.rst), plain text (.txt), and HTML (.html). You can customize this with the include-extensions input.
A: Use the exclude-pattern input with a regex pattern:

    exclude-pattern: 'localhost|127\.0\.0\.1|example\.com|internal\.company\.com'

A: Yes, but they need to be accessible from GitHub Actions runners. For private URLs, consider using allowlist or exclude-pattern to skip them.
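
Before committing an exclude-pattern, it can help to preview which URLs it would skip. The sketch below does that locally with Python's re module. It assumes the pattern is matched anywhere in the URL (unanchored), which is not confirmed here; urlsup is a Rust tool, so its regex engine may differ from Python's in edge cases, although simple alternations like this one should behave the same. The helper name is_excluded is hypothetical.

```python
import re

# Pattern taken from the exclude-pattern example in this FAQ.
EXCLUDE_PATTERN = r"localhost|127\.0\.0\.1|example\.com|internal\.company\.com"


def is_excluded(url: str, pattern: str = EXCLUDE_PATTERN) -> bool:
    """Return True if the pattern matches anywhere in the URL (assumed semantics)."""
    return re.search(pattern, url) is not None


if __name__ == "__main__":
    samples = [
        "http://localhost:3000/docs",
        "https://internal.company.com/wiki",
        "https://github.com/simeg/urlsup",
    ]
    for url in samples:
        print(("skip " if is_excluded(url) else "check"), url)
```

Because the match is unanchored in this sketch, a short alternative such as example\.com also matches URLs that merely contain that text, so keep the alternatives as specific as you can.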
A: This is common! Here's how to debug and fix:
Common causes:
- Rate limiting: Some sites block automated requests
- User-Agent blocking: Try setting a custom user-agent
- Geoblocking: GitHub runners are in different locations
- Authentication required: URLs requiring login will fail
Debugging steps:
- Check the annotations: they include specific suggestions for each URL
- Test with a custom user-agent:

      user-agent: 'Mozilla/5.0 (compatible; Documentation Bot)'

- Add rate limiting:

      rate-limit-ms: 2000 # 2 seconds between requests
      retry: 3

- Allow common "false positive" status codes:

      allow-status: '200,202,204,403,429' # Include 403 (Forbidden) and 429 (Rate Limited)

A: Use these inputs to be more respectful:

    rate-limit-ms: 1000 # 1 second between requests
    retry: 3 # Retry failed requests
    allow-status: '200,429' # Accept 429 (Too Many Requests)

A: Yes! Perfect for monitoring link rot:

    on:
      schedule:
        - cron: '0 9 * * 1' # Every Monday at 9 AM

A: The action now provides actionable suggestions in annotations. Here are common scenarios:
GitHub URLs failing:

    # GitHub often rate limits, be gentle
    rate-limit-ms: 1500
    retry: 2
    allow-status: '200,429' # Allow rate limit responses

API documentation with auth:

    # Skip authenticated endpoints
    exclude-pattern: 'api\.internal\.com|admin\.|\/auth\/'

International/CDN sites:

    # Some CDNs are geographically restricted
    timeout-seconds: 10 # Increase timeout
    allow-status: '200,403' # Allow forbidden for geo-blocking

Development/staging URLs:

    # Exclude development environments
    exclude-pattern: 'localhost|127\.0\.0\.1|dev\.|staging\.|\.local'

A: Use the failure-threshold input to fail the action only when the percentage of broken URLs exceeds a given threshold:

    # Allow up to 5% of URLs to be broken
    - name: Validate URLs with tolerance
      id: validate-urls
      uses: simeg/urlsup-action@v2
      with:
        failure-threshold: "5" # Fail only if >5% of URLs are broken
        files: 'docs/'

Common use cases:
- Documentation sites: Allow 2-5% broken links for external dependencies
- Large repositories: Set 1-3% threshold for legacy or external URLs
- CI/CD pipelines: Use higher thresholds (10-20%) for non-critical checks
How it works:
- If 100 URLs are found and 3 are broken (3%), action passes with failure-threshold: "5"
- If 100 URLs are found and 8 are broken (8%), action fails with failure-threshold: "5"
- Threshold information is displayed in job summaries with clear pass/fail status
- Leave empty for default behavior (any broken URL fails the action)
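
The pass/fail rule described above can be sketched in a few lines of Python. This is an illustration of the documented behavior, not the action's actual code: the function name should_fail is hypothetical, and treating a run with zero URLs as a pass is an assumption.

```python
from typing import Optional


def should_fail(total: int, broken: int, threshold: Optional[str]) -> bool:
    """Decide whether the run fails, following the documented threshold rule."""
    if total == 0:
        # Assumption: nothing to check means nothing can be broken.
        return False
    if threshold is None or threshold.strip() == "":
        # Default behavior: any broken URL fails the action.
        return broken > 0
    broken_percent = broken * 100 / total
    # "Fail only if more than X% of URLs are broken", so the comparison is strict.
    return broken_percent > float(threshold)


if __name__ == "__main__":
    print(should_fail(100, 3, "5"))  # 3% is within the 5% tolerance
    print(should_fail(100, 8, "5"))  # 8% exceeds the 5% tolerance
    print(should_fail(100, 1, ""))   # empty threshold: any broken URL fails
```

Note the strict comparison: with failure-threshold: "5", exactly 5% broken URLs still passes under this reading of "more than X%".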
A: Performance metrics appear in GitHub job summaries whenever telemetry is true, which is the default. The metrics include:
- Setup Time - How long it took to install/cache the urlsup binary
- Validation Time - Duration of URL checking process
- Cache Status - Whether the binary was cached (✅ Hit) or downloaded fresh (❌ Miss)
To explicitly enable performance metrics:

    - name: Validate URLs with telemetry
      id: validate-urls
      uses: simeg/urlsup-action@v2
      with:
        telemetry: true # Enable performance tracking (default: true)

To disable performance metrics:

    - name: Validate URLs without telemetry
      id: validate-urls
      uses: simeg/urlsup-action@v2
      with:
        telemetry: false # Disable performance tracking

A: You can use both features together for comprehensive URL validation:

    - name: Validate URLs with threshold and metrics
      id: validate-urls
      uses: simeg/urlsup-action@v2
      with:
        files: 'docs/ *.md'
        failure-threshold: "3" # Allow up to 3% broken URLs
        show-performance: true # Show detailed performance metrics
        retry: 2 # Retry failed URLs twice
        rate-limit-ms: 1000 # Be gentle with rate limiting

This configuration will:
- ✅ Show detailed performance metrics in the job summary
- ✅ Display failure threshold status (3% tolerance)
- ✅ Only fail if more than 3% of URLs are broken
- ✅ Provide actionable recommendations for broken URLs
A: Report bugs or feature requests on GitHub Issues
v1 (Docker-based):

    - uses: simeg/urlsup-action@v1
      with:
        args: '*.md --threads 10 --allow 429'

v2 (Composite with binary caching, 5-10x faster):

    - uses: simeg/urlsup-action@v2
      with:
        files: '*.md'
        concurrency: 10
        allow-status: '200,429'

- 🚀 Binary Caching: Automatic caching of urlsup binary across workflow runs
- ⚡ Faster Startup: 5-10x faster than Docker-based v1 (seconds vs minutes)
- 🔄 Smart Cache Keys: Version and platform-specific caching for reliability
- args input removed → Use structured inputs
- Default timeout changed from 10s → 5s
- Now creates annotations by default
- Requires actions/checkout@v4
Broken URLs appear as inline annotations in your files:
❌ example.md:15 Broken URL: https://example.com/dead-link (HTTP 404)
Rich HTML summaries with:
- 📊 Success rate visualization
- 📋 Broken URL details table
- 💡 Actionable recommendations
- 📁 Downloadable JSON reports
Detailed JSON reports are uploaded as workflow artifacts containing:
- Complete URL validation results
- File locations and line numbers
- HTTP status codes and error messages
- Timing and performance metrics
The action uses several Python scripts located in the scripts/ directory that handle the core functionality:
The main validation script that orchestrates the URL checking process.
Key features:
- Translates GitHub Action inputs to urlsup CLI arguments
- Executes urlsup binary with proper error handling
- Parses JSON output to extract metrics (total URLs, broken URLs, success rate)
- Sets GitHub Action outputs for use in subsequent steps
- Handles both successful and failed validation scenarios
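
As a rough illustration of the last two steps, the sketch below computes a success rate and formats outputs the way GitHub Actions expects them: name=value lines appended to the file named by the GITHUB_OUTPUT environment variable. The function names are hypothetical, the output names come from the Outputs table, and reporting 100% when no URLs were found is an assumption rather than the script's confirmed behavior.

```python
import os
from typing import Dict


def success_rate(total: int, broken: int) -> float:
    """Percentage of working URLs, rounded to one decimal place."""
    if total == 0:
        return 100.0  # Assumption: an empty run counts as fully successful.
    return round((total - broken) * 100 / total, 1)


def format_outputs(outputs: Dict[str, object]) -> str:
    """Render outputs as the name=value lines GitHub Actions reads."""
    return "".join(f"{name}={value}\n" for name, value in outputs.items())


if __name__ == "__main__":
    total, broken = 100, 3  # Illustrative numbers, not real results.
    text = format_outputs(
        {
            "total-urls": total,
            "broken-urls": broken,
            "success-rate": success_rate(total, broken),
        }
    )
    output_file = os.environ.get("GITHUB_OUTPUT")
    if output_file:
        # Inside a workflow run: append so earlier outputs are preserved.
        with open(output_file, "a", encoding="utf-8") as handle:
            handle.write(text)
    else:
        # Outside GitHub Actions: just show what would be written.
        print(text, end="")
```

A later step can then read a value with an expression such as steps.validate-urls.outputs.broken-urls.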
Creates GitHub annotations for broken URLs found during validation.
Key features:
- Parses urlsup JSON output to identify broken URLs
- Creates inline file annotations showing broken URLs with line numbers
- Supports multiple urlsup output formats for backward compatibility
- Formats error messages with HTTP status codes and detailed error information
- Gracefully handles parsing errors with fallback methods
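
Inline annotations on GitHub are produced by printing workflow commands of the form ::error file=...,line=...::message to standard output. The sketch below shows how one such line could be built for a broken URL. It is a minimal illustration, not the script's actual implementation: the function names are hypothetical, and the message wording simply mirrors the annotation example shown under GitHub Integration.

```python
def escape_data(value: str) -> str:
    """Escape the message part of a workflow command."""
    return value.replace("%", "%25").replace("\r", "%0D").replace("\n", "%0A")


def escape_property(value: str) -> str:
    """Escape a property value (file, line); ':' and ',' are delimiters there."""
    return escape_data(value).replace(":", "%3A").replace(",", "%2C")


def format_annotation(file: str, line: int, url: str, status: int) -> str:
    """Build one ::error command that annotates a broken URL."""
    message = f"Broken URL: {url} (HTTP {status})"
    return (
        f"::error file={escape_property(file)},line={line}"
        f"::{escape_data(message)}"
    )


if __name__ == "__main__":
    # Printing the command to stdout is what creates the annotation in a run.
    print(format_annotation("example.md", 15, "https://example.com/dead-link", 404))
```

The two escaping helpers matter because file names may contain commas or colons, and an unescaped newline in the message would end the command early.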
Generates rich HTML job summaries for the GitHub Actions interface.
Key features:
- Creates formatted job summary with success metrics and visual progress bars
- Displays detailed broken URL information in organized tables
- Provides actionable recommendations for fixing different types of issues
- Includes expandable sections with technical details and metadata
- Handles both successful runs and error scenarios
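
Job summaries are written by appending Markdown (or HTML) to the file named by the GITHUB_STEP_SUMMARY environment variable. The sketch below builds a small summary with a text progress bar. It is only a guess at the general shape: the heading, layout, bar characters and function names are all assumptions, and the real script produces richer output.

```python
import os


def progress_bar(rate: float, width: int = 20) -> str:
    """Render a success rate (0-100) as a fixed-width text bar."""
    rate = max(0.0, min(100.0, rate))
    filled = round(rate / 100 * width)
    return "█" * filled + "░" * (width - filled)


def build_summary(total: int, broken: int) -> str:
    """Build a short Markdown summary for the job summary page."""
    rate = 100.0 if total == 0 else (total - broken) * 100 / total
    lines = [
        "## URL validation",
        "",
        f"{progress_bar(rate)} {rate:.1f}% ({total - broken}/{total} URLs OK)",
    ]
    return "\n".join(lines) + "\n"


if __name__ == "__main__":
    summary = build_summary(100, 3)  # Illustrative numbers, not real results.
    summary_file = os.environ.get("GITHUB_STEP_SUMMARY")
    if summary_file:
        with open(summary_file, "a", encoding="utf-8") as handle:
            handle.write(summary)
    else:
        print(summary, end="")
```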
Shared utilities and helper functions used across all scripts.
Key features:
- Centralized logging with consistent formatting and colors
- JSON report parsing with support for multiple urlsup output formats
- File path normalization and GitHub workspace handling
- Markdown escaping for safe display in job summaries
- GitHub Actions integration utilities
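
To illustrate why Markdown escaping matters for summaries, here is a minimal sketch of such a helper. URLs and error messages can contain characters like _, | or <, which would otherwise break table cells or be read as formatting. The set of escaped characters and the name escape_markdown are assumptions; the shared utilities may handle a different set.

```python
import re

# Characters that commonly alter Markdown rendering inside table cells.
_MD_SPECIAL = re.compile(r"([\\`*_\[\]<>|])")


def escape_markdown(text: str) -> str:
    """Prefix Markdown-significant characters with a backslash."""
    return _MD_SPECIAL.sub(r"\\\1", text)


if __name__ == "__main__":
    print(escape_markdown("https://example.com/a_b?x=1|2"))
```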
The action also includes inline setup logic that:
- Installs the Rust toolchain if needed
- Downloads and installs the urlsup binary via cargo install
- Handles version pinning and caching through GitHub's built-in mechanisms
These components work together to provide a seamless URL validation experience with rich GitHub integration, automatic binary management, and comprehensive error reporting.
This action includes comprehensive testing infrastructure:
- Unit Tests - Full coverage of Python scripts with pytest
- End-to-End Tests - Real-world validation scenarios with generated test data
- CI Pipeline - Multi-platform testing across Python versions
- Example Workflows - Real-world configurations in examples/
- Configuration Templates - Pre-built configs for common scenarios

    # Install Poetry (if not already installed)
    curl -sSL https://install.python-poetry.org | python3 -
    # Install development dependencies
    make install
    # Run tests
    make test
    # Check code quality
    make lint
    # Format code
    make format
    # Run full CI simulation
    make ci-local

See TESTING.md for detailed testing documentation.
- urlsup - The underlying Rust CLI tool
- Actions Marketplace - Find this action
- GitHub Actions Documentation - Learn more about workflows
MIT © Simon Egersand