VideoCraft

A secure, AI-powered video generation platform that creates videos from JSON configurations using FFmpeg and OpenAI Whisper.

What VideoCraft Does

VideoCraft transforms JSON configurations into complete videos with:

Automatic scene composition from multiple media sources (video, audio, images)
AI-powered progressive subtitles with word-level timing precision
Security-first architecture with comprehensive input validation and CSRF protection
Production-ready deployment with Docker and Kubernetes support

Key Innovation: Progressive Subtitles

Traditional subtitle systems display a whole line or sentence at once. VideoCraft instead uses OpenAI Whisper to produce word-level timestamps and reveals each word as it is spoken, creating engaging, TikTok-style subtitle effects.

{
  "type": "subtitles",
  "settings": {
    "style": "progressive",
    "font_family": "Arial Black",
    "font_size": 32,
    "word_color": "#FFFFFF",
    "outline_color": "#000000"
  }
}
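
To make the idea of word-level timing concrete, here is a minimal sketch of how per-word timestamps could be turned into progressive subtitle cues. It is not VideoCraft's actual implementation: the names `Word`, `Cue`, and `progressive_cues` are hypothetical, and the sketch assumes a cumulative display in which each new word is added to the words already on screen.

```python
from typing import List, NamedTuple


class Word(NamedTuple):
    text: str
    start: float  # seconds from the start of the audio
    end: float


class Cue(NamedTuple):
    start: float
    end: float
    text: str


def progressive_cues(words: List[Word]) -> List[Cue]:
    """Emit one cue per word; each cue shows every word spoken so far."""
    cues = []
    for i, word in enumerate(words):
        # A cue stays on screen until the next word starts,
        # or until the last word ends.
        end = words[i + 1].start if i + 1 < len(words) else word.end
        text = " ".join(w.text for w in words[: i + 1])
        cues.append(Cue(word.start, end, text))
    return cues


timings = [Word("Hello", 0.0, 0.4), Word("from", 0.5, 0.7), Word("VideoCraft", 0.8, 1.5)]
for cue in progressive_cues(timings):
    print(f"{cue.start:.2f} -> {cue.end:.2f}  {cue.text}")
```

A renderer would then draw each cue for its time span, which is what makes words appear in step with the narration.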

Quick Start

Prerequisites

Go 1.24+
FFmpeg (with libx264)
Python 3.8+ with pip
Docker (recommended)

1. Clone and Setup

git clone https://github.com/activadee/videocraft.git
cd videocraft

# Install Python dependencies for Whisper
pip install -r scripts/requirements.txt

# Build the application
make build

2. Configure Security (Required)

# Set allowed domains for CORS (required for web clients)
export VIDEOCRAFT_SECURITY_ALLOWED_DOMAINS="localhost:3000,yourdomain.com"

# Set an API key (one is auto-generated if left empty)
export VIDEOCRAFT_SECURITY_API_KEY="your-secure-api-key"

3. Start the Server

./videocraft
# Server starts on http://localhost:3002

4. Generate Your First Video

# Create a simple video configuration
cat > example.json << 'EOF'
[
  {
    "comment": "My first VideoCraft video",
    "resolution": "1920x1080",
    "quality": "medium",
    "scenes": [
      {
        "id": "intro",
        "background-color": "#1a1a1a",
        "elements": [
          {
            "type": "audio",
            "src": "https://example.com/your-audio.mp3"
          },
          {
            "type": "subtitles",
            "settings": {
              "style": "progressive",
              "font_family": "Arial Black",
              "font_size": 36,
              "word_color": "#FFFFFF",
              "outline_color": "#FF6B6B"
            }
          }
        ]
      }
    ]
  }
]
EOF

# Generate video
curl -X POST http://localhost:3002/api/v1/videos \
  -H "Content-Type: application/json" \
  -H "Authorization: Bearer your-secure-api-key" \
  -d @example.json

# Check job status (set JOB_ID to the job_id from the response)
curl "http://localhost:3002/api/v1/jobs/$JOB_ID" \
  -H "Authorization: Bearer your-secure-api-key"

# Download the completed video (set VIDEO_ID to the video_id from the job status)
curl "http://localhost:3002/api/v1/videos/$VIDEO_ID" \
  -H "Authorization: Bearer your-secure-api-key" \
  -o output.mp4

Configuration Format

VideoCraft uses an array-based JSON format supporting multiple projects:

[
  {
    "comment": "Project description",
    "resolution": "1920x1080",
    "quality": "high",
    "scenes": [
      {
        "id": "scene1",
        "background-color": "#000000",
        "elements": [
          {
            "type": "video",
            "src": "https://example.com/background.mp4",
            "volume": 0.3,
            "z-index": -1
          },
          {
            "type": "audio", 
            "src": "https://example.com/narration.mp3",
            "volume": 1.0
          },
          {
            "type": "image",
            "src": "https://example.com/logo.png",
            "x": 100,
            "y": 50,
            "z-index": 10
          },
          {
            "type": "subtitles",
            "settings": {
              "style": "progressive",
              "font_family": "Impact",
              "font_size": 48,
              "word_color": "#FFFF00",
              "outline_color": "#000000",
              "position": "center-bottom"
            }
          }
        ]
      }
    ],
    "elements": []
  }
]

The top-level "elements" array of each project holds global elements that are applied to all scenes.
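
For larger projects it can be easier to build the configuration in code than to write the JSON by hand. The sketch below assembles the array-based format using only the key names that appear in the example; the helper names `make_scene`, `make_project`, and `to_payload` are hypothetical and not part of VideoCraft.

```python
import json
from typing import Any, Dict, List


def make_scene(scene_id: str, elements: List[Dict[str, Any]],
               background: str = "#000000") -> Dict[str, Any]:
    """Return one scene using the key names from the example configuration."""
    return {"id": scene_id, "background-color": background, "elements": elements}


def make_project(comment: str, scenes: List[Dict[str, Any]],
                 resolution: str = "1920x1080", quality: str = "high") -> Dict[str, Any]:
    """Return one project; the global element list is left empty here."""
    return {
        "comment": comment,
        "resolution": resolution,
        "quality": quality,
        "scenes": scenes,
        "elements": [],
    }


def to_payload(projects: List[Dict[str, Any]]) -> str:
    """Serialize the projects; the top level must be a JSON array."""
    if not isinstance(projects, list):
        raise TypeError("top-level configuration must be a list of projects")
    return json.dumps(projects, indent=2)


narration = {"type": "audio", "src": "https://example.com/narration.mp3", "volume": 1.0}
subtitles = {"type": "subtitles", "settings": {"style": "progressive"}}
payload = to_payload([make_project("Demo", [make_scene("scene1", [narration, subtitles])])])
print(payload)
```

The resulting string can be written to a file and posted with `curl -d @file.json` exactly like a hand-written configuration.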

Element Types

Video Elements

{
  "type": "video",
  "src": "https://example.com/video.mp4",
  "x": 0, "y": 0,
  "volume": 0.5,
  "z-index": 1
}

Audio Elements

{
  "type": "audio", 
  "src": "https://example.com/audio.mp3",
  "volume": 1.0,
  "duration": 30.5
}

Image Elements

{
  "type": "image",
  "src": "https://example.com/image.png", 
  "x": 100, "y": 200,
  "z-index": 5
}

Subtitle Elements

{
  "type": "subtitles",
  "settings": {
    "style": "progressive",
    "font_family": "Arial Black",
    "font_size": 32,
    "word_color": "#FFFFFF",
    "outline_color": "#000000",
    "position": "center-bottom"
  }
}

"style" is either "progressive" or "classic". "position" places the text at the top, center, or bottom of the frame; the examples here use "center-bottom".
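
Checking elements on the client before submitting a job can catch simple mistakes early. The following is a rough sketch under stated assumptions, not the server's validation logic: it assumes the four element types listed above are the only ones, that media elements always need a `src`, and that `x`, `y`, and `z-index` are integers as in the examples. The function name `element_problems` is hypothetical.

```python
from typing import Any, Dict, List

MEDIA_TYPES = {"video", "audio", "image"}
KNOWN_TYPES = MEDIA_TYPES | {"subtitles"}


def element_problems(element: Dict[str, Any]) -> List[str]:
    """Return human-readable problems; an empty list means nothing was flagged."""
    kind = element.get("type")
    if kind not in KNOWN_TYPES:
        return [f"unknown element type: {kind!r}"]

    problems = []
    if kind in MEDIA_TYPES:
        src = element.get("src")
        if not isinstance(src, str) or not src:
            problems.append(f"{kind} element needs a non-empty 'src'")

    for key in ("x", "y", "z-index"):
        if key in element:
            value = element[key]
            # bool is a subclass of int in Python, so exclude it explicitly.
            if isinstance(value, bool) or not isinstance(value, int):
                problems.append(f"'{key}' should be an integer")
    return problems


print(element_problems({"type": "image", "src": "https://example.com/image.png", "x": 100}))
print(element_problems({"type": "audio"}))
```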

API Reference

Authentication

All endpoints require Bearer token authentication:

Authorization: Bearer YOUR_API_KEY

Endpoints

Create Video Generation Job

POST /api/v1/videos
Content-Type: application/json

Body: JSON array of video configurations
Response: {"job_id": "uuid", "status": "pending"}

Get Job Status

GET /api/v1/jobs/{job_id}
Response: {
  "job_id": "uuid",
  "status": "completed|processing|failed|pending",
  "progress": 85,
  "video_id": "uuid",
  "error": "error message if failed"
}

Download Video

GET /api/v1/videos/{video_id}
Response: MP4 video file

Health Check

GET /health
Response: {"status": "healthy", "timestamp": "2024-01-01T12:00:00Z"}

CSRF Token (if CSRF enabled)

GET /api/v1/csrf-token  
Response: {"csrf_token": "secure-token"}
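
The endpoints above combine into a submit, poll, download workflow. Here is a minimal client sketch using only the Python standard library; it relies on the paths and response fields documented in this section, while the function names (`http_json`, `wait_for_video`), the polling interval, and the polling budget are assumptions made for illustration.

```python
import json
import time
import urllib.request
from typing import Any, Callable, Dict, List

BASE_URL = "http://localhost:3002"


def http_json(method: str, url: str, api_key: str, body: Any = None) -> Dict[str, Any]:
    """Send one authenticated request and decode the JSON response."""
    data = json.dumps(body).encode("utf-8") if body is not None else None
    request = urllib.request.Request(url, data=data, method=method)
    request.add_header("Authorization", f"Bearer {api_key}")
    if data is not None:
        request.add_header("Content-Type", "application/json")
    with urllib.request.urlopen(request, timeout=30) as response:
        return json.loads(response.read().decode("utf-8"))


def wait_for_video(projects: List[dict], api_key: str,
                   send: Callable[..., Dict[str, Any]] = http_json,
                   poll_seconds: float = 2.0, max_polls: int = 300) -> str:
    """Submit a job, poll until it finishes, and return the video_id."""
    job = send("POST", f"{BASE_URL}/api/v1/videos", api_key, projects)
    for _ in range(max_polls):
        status = send("GET", f"{BASE_URL}/api/v1/jobs/{job['job_id']}", api_key)
        if status["status"] == "completed":
            return status["video_id"]
        if status["status"] == "failed":
            raise RuntimeError(status.get("error", "job failed"))
        time.sleep(poll_seconds)
    raise TimeoutError("job did not finish within the polling budget")
```

With a running server, `wait_for_video(projects, "your-secure-api-key")` returns the ID to pass to `GET /api/v1/videos/{video_id}`. The `send` parameter exists so the polling logic can be exercised without a server.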

Security Features

VideoCraft implements comprehensive security measures:

CORS Protection

No wildcard origins - explicit domain allowlisting required
Secure credentials handling with proper origin validation
Request method restrictions to approved HTTP methods

# Required: Configure allowed domains
export VIDEOCRAFT_SECURITY_ALLOWED_DOMAINS="localhost:3000,yourdomain.com"
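
As an illustration of what explicit allowlisting means, the sketch below compares a browser's `Origin` header against the comma-separated value of `VIDEOCRAFT_SECURITY_ALLOWED_DOMAINS`. This shows the general technique only; how VideoCraft's middleware matches origins is not documented here, and the function names are hypothetical.

```python
from typing import Set
from urllib.parse import urlsplit


def parse_allowed_domains(raw: str) -> Set[str]:
    """Split the comma-separated value and normalise case and whitespace."""
    return {item.strip().lower() for item in raw.split(",") if item.strip()}


def origin_allowed(origin: str, allowed: Set[str]) -> bool:
    """Exact host[:port] match only: no wildcards and no suffix matching."""
    parts = urlsplit(origin)
    if parts.scheme not in ("http", "https") or not parts.netloc:
        return False
    return parts.netloc.lower() in allowed


allowed = parse_allowed_domains("localhost:3000,yourdomain.com")
print(origin_allowed("http://localhost:3000", allowed))
print(origin_allowed("https://yourdomain.com.evil.example", allowed))
```

Matching the full host and port, rather than testing whether the origin merely ends with an allowed domain, is what keeps look-alike hosts out.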

CSRF Protection

Token-based validation for state-changing requests
Secure token generation with cryptographic randomness
Optional but recommended for production environments

export VIDEOCRAFT_SECURITY_ENABLE_CSRF=true
export VIDEOCRAFT_SECURITY_CSRF_SECRET="your-secure-secret"
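
One common way to build token-based CSRF validation from a server-side secret is to sign a random nonce with HMAC. The sketch below illustrates that generic technique; it is an assumption that VideoCraft works along these lines, and the token layout (`nonce.signature`) and function names are invented for the example.

```python
import hashlib
import hmac
import secrets


def _sign(secret: str, nonce: str) -> str:
    return hmac.new(secret.encode("utf-8"), nonce.encode("utf-8"), hashlib.sha256).hexdigest()


def issue_token(secret: str) -> str:
    """Return 'nonce.signature', where the signature is HMAC-SHA256 over the nonce."""
    nonce = secrets.token_hex(16)  # 128 bits from the OS random source
    return f"{nonce}.{_sign(secret, nonce)}"


def token_valid(secret: str, token: str) -> bool:
    """Recompute the signature and compare it in constant time."""
    nonce, separator, signature = token.partition(".")
    if not separator:
        return False
    expected = _sign(secret, nonce)
    return hmac.compare_digest(expected.encode("utf-8"), signature.encode("utf-8"))


token = issue_token("your-secure-secret")
print(token_valid("your-secure-secret", token))
```

A production scheme would usually also bind the token to a session and give it an expiry; both are omitted here for brevity.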

Input Validation

URL validation prevents SSRF attacks
File size limits prevent resource exhaustion
Media format validation ensures safe file processing
Command injection protection for FFmpeg operations
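
To show what URL validation against SSRF can look like, here is a small sketch that accepts only `http` and `https` URLs whose host resolves exclusively to public addresses. It demonstrates the general approach rather than VideoCraft's actual checks; the function names and the injectable `resolve` parameter are hypothetical.

```python
import ipaddress
import socket
from urllib.parse import urlsplit


def is_public_address(address: str) -> bool:
    """True only for addresses outside private, loopback and other special ranges."""
    try:
        ip = ipaddress.ip_address(address)
    except ValueError:
        return False
    return not (ip.is_private or ip.is_loopback or ip.is_link_local
                or ip.is_multicast or ip.is_reserved or ip.is_unspecified)


def media_url_allowed(url: str, resolve=socket.getaddrinfo) -> bool:
    """Accept http(s) URLs whose host resolves only to public addresses."""
    parts = urlsplit(url)
    if parts.scheme not in ("http", "https") or not parts.hostname:
        return False
    try:
        infos = resolve(parts.hostname, None)
    except OSError:
        return False
    # Each getaddrinfo entry ends with a sockaddr tuple whose first item is the IP.
    return bool(infos) and all(is_public_address(info[4][0]) for info in infos)


print(is_public_address("169.254.169.254"))
print(media_url_allowed("file:///etc/passwd"))
```

Because DNS answers can change between the check and the download, a robust implementation would also pin the connection to the address that was validated.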

Error Handling

Sanitized error responses prevent information disclosure
Structured logging for security event monitoring
Rate limiting prevents abuse and DoS attacks
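
Rate limiting can be implemented in several ways. The sketch below uses a sliding window per client, with the default of 100 requests per minute taken from the `rate_limit` setting in the configuration; the algorithm choice, the class name, and the use of a client identifier string are assumptions, since the README does not describe how the limiter works.

```python
import time
from collections import defaultdict, deque
from typing import Callable, Deque, Dict


class SlidingWindowLimiter:
    """Allow at most `limit` requests per client within any `window_seconds` span."""

    def __init__(self, limit: int = 100, window_seconds: float = 60.0,
                 clock: Callable[[], float] = time.monotonic) -> None:
        self.limit = limit
        self.window = window_seconds
        self.clock = clock
        self.hits: Dict[str, Deque[float]] = defaultdict(deque)

    def allow(self, client_id: str) -> bool:
        now = self.clock()
        hits = self.hits[client_id]
        # Drop timestamps that have fallen out of the window.
        while hits and now - hits[0] >= self.window:
            hits.popleft()
        if len(hits) >= self.limit:
            return False
        hits.append(now)
        return True


limiter = SlidingWindowLimiter(limit=2)
print([limiter.allow("client-a") for _ in range(3)])
```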

Architecture

graph TB
    Client[Web Client] --> API[HTTP API :3002]
    API --> Auth[Security Middleware]
    Auth --> Jobs[Job Queue]
    
    Jobs --> Audio[Audio Service]
    Jobs --> Whisper[Whisper Daemon]
    Jobs --> FFmpeg[FFmpeg Service]
    Jobs --> Storage[File Storage]
    
    Audio --> Probe[FFprobe Analysis]
    Whisper --> AI[OpenAI Whisper]
    FFmpeg --> Encoder[Video Encoder]
    Storage --> Files[Local Filesystem]

Core Components

HTTP API: Gin web framework with security middleware
Job Queue: Async processing with worker pools
Audio Service: Duration analysis and metadata extraction
Whisper Daemon: Persistent Python process for AI transcription
FFmpeg Service: Secure video composition and encoding
Storage Service: File management with cleanup policies
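
The job queue component can be pictured as a bounded queue drained by a fixed pool of workers. VideoCraft itself is written in Go, so the Python sketch below is only a conceptual model: the class `JobQueue`, its methods, and the status bookkeeping are hypothetical, while the default sizes mirror the `workers` and `queue_size` values in the configuration and the status names come from the API reference.

```python
import queue
import threading
from typing import Callable, Dict


class JobQueue:
    def __init__(self, handler: Callable[[str], None],
                 workers: int = 4, queue_size: int = 100) -> None:
        self.handler = handler
        self.jobs = queue.Queue(maxsize=queue_size)
        self.status: Dict[str, str] = {}
        self.lock = threading.Lock()
        for _ in range(workers):
            threading.Thread(target=self._work, daemon=True).start()

    def submit(self, job_id: str) -> bool:
        """Queue a job; return False instead of blocking when the queue is full."""
        # Record the status first so a fast worker cannot be overwritten.
        self._set(job_id, "pending")
        try:
            self.jobs.put_nowait(job_id)
        except queue.Full:
            with self.lock:
                del self.status[job_id]
            return False
        return True

    def _work(self) -> None:
        while True:
            job_id = self.jobs.get()
            self._set(job_id, "processing")
            try:
                self.handler(job_id)
                self._set(job_id, "completed")
            except Exception:
                self._set(job_id, "failed")
            finally:
                self.jobs.task_done()

    def _set(self, job_id: str, state: str) -> None:
        with self.lock:
            self.status[job_id] = state
```

Rejecting submissions when the queue is full, rather than blocking the HTTP handler, keeps the API responsive under load.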

Configuration

VideoCraft supports comprehensive configuration via config.yaml:

server:
  host: "0.0.0.0"
  port: 3002

ffmpeg:
  binary_path: "ffmpeg"
  timeout: "1h"
  quality: 23        # CRF value (lower = better quality)
  preset: "medium"   # Encoding speed

transcription:
  enabled: true
  daemon:
    enabled: true
    idle_timeout: "300s"     # Shutdown after 5min idle
    startup_timeout: "120s"  # Max startup time
    restart_max_attempts: 3
  python:
    path: "python3"
    model: "base"           # tiny/base/small/medium/large
    language: "auto"        # Auto-detect or specific language
    device: "cpu"           # cpu/cuda

subtitles:
  enabled: true
  style: "progressive"       # progressive/classic
  font_family: "Arial"
  font_size: 24
  position: "center-bottom"
  colors:
    word: "#FFFFFF"
    outline: "#000000"

storage:
  output_dir: "./generated_videos"
  temp_dir: "./temp" 
  max_file_size: 1073741824  # 1 GiB limit, in bytes
  retention_days: 7

job:
  workers: 4          # Concurrent job workers
  queue_size: 100     # Max queued jobs
  max_concurrent: 10  # Max concurrent jobs

security:
  rate_limit: 100      # Requests per minute
  enable_auth: true    # API key authentication
  api_key: ""          # Auto-generated if empty
  enable_csrf: false   # CSRF protection
  allowed_domains: []  # CORS allowed origins
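
To illustrate how the `ffmpeg` settings relate to an encode, here is a sketch that turns `quality` (CRF) and `preset` into an argument list for libx264. The flags used (`-y`, `-i`, `-c:v`, `-crf`, `-preset`) are standard FFmpeg options, but the exact command VideoCraft runs is not shown in this README, so treat the function `encode_command` as a hypothetical example.

```python
from typing import List


def encode_command(source: str, output: str, crf: int = 23, preset: str = "medium",
                   binary: str = "ffmpeg") -> List[str]:
    """Build an argument list; values are never interpolated into a shell string."""
    if not 0 <= crf <= 51:
        raise ValueError("libx264 CRF must be between 0 and 51")
    return [binary, "-y", "-i", source, "-c:v", "libx264",
            "-crf", str(crf), "-preset", preset, output]


print(encode_command("input.mp4", "output.mp4"))
```

Passing this list to `subprocess.run(cmd, check=True, timeout=3600)` without `shell=True` matches the one-hour `timeout` above and avoids shell interpretation of file names, which is one ingredient of command injection protection.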

Docker Deployment

Using Docker Compose (Recommended)

version: '3.8'
services:
  videocraft:
    build: .
    ports:
      - "3002:3002"
    environment:
      - VIDEOCRAFT_SECURITY_ALLOWED_DOMAINS=localhost:3000,yourdomain.com
      - VIDEOCRAFT_SECURITY_API_KEY=your-secure-api-key
    volumes:
      - ./generated_videos:/app/generated_videos
      - ./cache:/app/cache
    security_opt:
      - no-new-privileges:true
    user: "1000:1000"
    read_only: true
    tmpfs:
      - /tmp:size=1G,noexec,nosuid,nodev

# Start the service
docker-compose up -d

# Check logs
docker-compose logs -f videocraft

# Stop the service  
docker-compose down

Manual Docker Build

# Build image
docker build -t videocraft .

# Run container
docker run -d \
  --name videocraft \
  -p 3002:3002 \
  -e VIDEOCRAFT_SECURITY_ALLOWED_DOMAINS="localhost:3000" \
  -e VIDEOCRAFT_SECURITY_API_KEY="your-key" \
  -v $(pwd)/generated_videos:/app/generated_videos \
  videocraft

Development

Project Structure

videocraft/
├── cmd/videocraft/         # Application entry point
├── internal/
│   ├── api/                # HTTP handlers and middleware
│   ├── core/               # Business logic and services
│   ├── app/                # Configuration management  
│   ├── pkg/                # Shared utilities (logging, errors)
│   └── storage/            # File storage backend
├── scripts/                # Python Whisper daemon
├── config/                 # Configuration files
└── docs/                   # Technical documentation

Building & Testing

# Install dependencies
go mod download
pip install -r scripts/requirements.txt

# Development build
make build

# Run tests
make test

# Run with live reload (requires air)
make dev

# Security scan
make security

# Generate coverage report
make coverage

# Clean build artifacts
make clean

Environment Variables

Required for Web Clients:

export VIDEOCRAFT_SECURITY_ALLOWED_DOMAINS="localhost:3000,yourdomain.com"

Optional Security:

export VIDEOCRAFT_SECURITY_API_KEY="your-secure-api-key"
export VIDEOCRAFT_SECURITY_ENABLE_CSRF=true
export VIDEOCRAFT_SECURITY_CSRF_SECRET="your-csrf-secret"

Server Configuration:

export VIDEOCRAFT_SERVER_HOST="0.0.0.0"
export VIDEOCRAFT_SERVER_PORT=3002

Storage Configuration:

export VIDEOCRAFT_STORAGE_OUTPUT_DIR="./generated_videos"
export VIDEOCRAFT_STORAGE_TEMP_DIR="./temp"

Whisper Configuration:

export VIDEOCRAFT_TRANSCRIPTION_PYTHON_MODEL="base"
export VIDEOCRAFT_TRANSCRIPTION_PYTHON_DEVICE="cpu"
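
The variable names above appear to mirror the nesting in config.yaml: a `VIDEOCRAFT_` prefix followed by the upper-cased key path joined with underscores. Because keys such as `allowed_domains` contain underscores themselves, the mapping has to be resolved against the known configuration tree. The sketch below shows one way to do that; the naming rule is inferred from the examples, and `DEFAULTS` and `find_path` are hypothetical.

```python
from typing import Any, Dict, List, Optional

# A small excerpt of the configuration tree shown in the Configuration section.
DEFAULTS: Dict[str, Any] = {
    "server": {"host": "0.0.0.0", "port": 3002},
    "security": {"allowed_domains": [], "api_key": "", "enable_csrf": False},
    "transcription": {"python": {"model": "base", "device": "cpu"}},
}


def find_path(name: str, tree: Dict[str, Any],
              prefix: str = "VIDEOCRAFT") -> Optional[List[str]]:
    """Return the key path whose upper-cased, underscore-joined form equals `name`."""
    for key, value in tree.items():
        candidate = f"{prefix}_{key.upper()}"
        if candidate == name and not isinstance(value, dict):
            return [key]
        if isinstance(value, dict) and name.startswith(candidate + "_"):
            rest = find_path(name, value, candidate)
            if rest is not None:
                return [key] + rest
    return None


print(find_path("VIDEOCRAFT_TRANSCRIPTION_PYTHON_MODEL", DEFAULTS))
print(find_path("VIDEOCRAFT_SECURITY_ALLOWED_DOMAINS", DEFAULTS))
```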

Troubleshooting

Common Issues

Server won't start

# Check port availability
lsof -i :3002

# Verify FFmpeg installation
ffmpeg -version

# Check Python dependencies
python3 -c "import whisper; print('Whisper OK')"

CORS errors in browser

# Add your domain to allowed list
export VIDEOCRAFT_SECURITY_ALLOWED_DOMAINS="localhost:3000,yourdomain.com"

# Check browser console for specific error
# Look for server logs: docker-compose logs videocraft | grep CORS

Whisper daemon fails

# Test Whisper manually
python3 scripts/whisper_daemon.py

# Check available models
python3 -c "import whisper; print(whisper.available_models())"

# For GPU support, install CUDA version of PyTorch
pip install torch torchaudio --index-url https://download.pytorch.org/whl/cu118

Video generation fails

# Verify media URLs are accessible
curl -I "https://your-media-url.com/audio.mp3"

# Check FFmpeg can process your media
ffprobe "https://your-media-url.com/audio.mp3"

# Monitor job status for detailed error messages (set JOB_ID to your job's ID)
curl "http://localhost:3002/api/v1/jobs/$JOB_ID" \
  -H "Authorization: Bearer your-secure-api-key"

Out of memory errors

# Use smaller Whisper model
export VIDEOCRAFT_TRANSCRIPTION_PYTHON_MODEL="tiny"

# Reduce concurrent jobs
export VIDEOCRAFT_JOB_MAX_CONCURRENT=2

# Increase Docker memory limit
docker run --memory=4g videocraft

Debug Commands

# Test API connectivity
curl http://localhost:3002/health

# Get CSRF token
curl http://localhost:3002/api/v1/csrf-token

# Test authentication
curl -H "Authorization: Bearer your-key" http://localhost:3002/health

# Monitor logs
docker-compose logs -f videocraft | grep ERROR

Performance & Scaling

Resource Requirements

CPU: 2+ cores (FFmpeg encoding is CPU-intensive)
Memory: 4GB+ (Whisper models require significant RAM)
Storage: SSD recommended for video I/O
Network: High bandwidth for external media downloads

Optimization Tips

Use smaller Whisper models (tiny/base) for faster processing
Enable GPU acceleration if available (CUDA support)
Implement Redis for job queue in multi-instance deployments
Use CDN for frequently accessed media files
Configure FFmpeg presets based on quality vs speed requirements

Scaling Options

Horizontal scaling: Multiple VideoCraft instances behind load balancer
Dedicated workers: Separate transcription and video processing services
External storage: S3/MinIO for generated videos
Queue backend: Redis or RabbitMQ for job distribution

License

MIT License - see LICENSE file for details.

Contributing

1. Fork the repository
2. Create a feature branch (git checkout -b feature/new-feature)
3. Make your changes with tests
4. Run the full test suite (make test)
5. Submit a pull request

Development Guidelines

Follow Go best practices and idioms
Add unit tests for new functionality
Update documentation for API changes
Use conventional commit messages
Ensure security validations for user inputs

Built with Go, FFmpeg, and OpenAI Whisper
