Real-time Air Gesture Recognition Desktop Application
AirCut is a modern desktop application that enables intuitive air gesture recognition using computer vision and machine learning. Draw gestures in the air with your hand and execute custom commands through advanced gesture recognition algorithms.
- Real-time Hand Tracking - Advanced computer vision using Roboflow inference
- Air Gesture Recording - Capture complex gesture trajectories in 3D space
- Template-based Recognition - Create and manage custom gesture templates
- Command Execution - Link gestures to system commands or actions
- Silent Operation - Clean, distraction-free user experience
- High-Performance Streaming - Optimized 25 FPS processing with 240x180 inference resolution
- Local Processing - No cloud dependencies, complete privacy
- Real-time Feedback - Instant visual feedback during gesture recording
- Modern UI - Beautiful dark/light mode interface with smooth animations
- Cross-Platform - Built with Tauri for native desktop performance
- Dynamic Time Warping (DTW) - Advanced gesture matching algorithm
- Confidence Thresholds - Adjustable detection and recognition sensitivity
- Stateless Architecture - Efficient client-server communication
- WebSocket Communication - Real-time bidirectional data exchange
- Frame-by-Frame Processing - Optimized video processing pipeline
AirCut follows a modern client-server architecture designed for performance and maintainability:
┌─────────────────────────────────────────────────────────────┐
│ AirCut Desktop App │
│ (Tauri) │
├─────────────────────────────────────────────────────────────┤
│ Frontend (React + TypeScript) │
│ ├── Video Capture & Streaming │
│ ├── Gesture Visualization │
│ ├── Template Management (Client-side Storage) │
│ └── Real-time UI Updates │
├─────────────────────────────────────────────────────────────┤
│ Backend Communication │
│ ├── WebSocket Camera Stream (/ws/frames) │
│ ├── WebSocket Gesture Recognition (/ws/gestures) │
│ └── REST API Health Checks │
└─────────────────────────────────────────────────────────────┘
│
▼
┌─────────────────────────────────────────────────────────────┐
│ Backend Server │
│ (Python FastAPI) │
├─────────────────────────────────────────────────────────────┤
│ Frame Processing Pipeline │
│ ├── Frame Decoding & Preprocessing │
│ ├── Roboflow ML Inference │
│ ├── Hand Detection │
│ └── Coordinate Normalization │
├─────────────────────────────────────────────────────────────┤
│ Gesture Recognition Engine │
│ ├── Trajectory Processing │
│ ├── DTW-based Template Matching │
│ ├── Confidence Scoring │
│ └── Stateless Recognition Service │
└─────────────────────────────────────────────────────────────┘
│
▼
┌─────────────────────────────────────────────────────────────┐
│ Roboflow Inference API │
│ (Computer Vision ML) │
├─────────────────────────────────────────────────────────────┤
│ Hand Detection Model │
│ ├── Real-time Hand Tracking │
│ ├── Bounding Box Detection │
│ ├── Confidence Scoring │
│ └── Coordinate Extraction │
└─────────────────────────────────────────────────────────────┘
- Video Capture: Frontend captures webcam frames at 30 FPS
- Camera Stream: Compressed frames sent to backend via Camera Stream WebSocket (/ws/frames); see the client sketch after this list
- ML Inference: Backend processes frames using Roboflow API (25 FPS)
- Detection Results: Hand coordinates sent back to frontend via Camera Stream WebSocket
- Gesture Recording: Frontend tracks hand movements when drawing
- Recognition: Recorded trajectories compared against stored templates via Gesture WebSocket (/ws/gestures)
- Command Execution: Matched gestures trigger associated commands
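The camera-stream leg of this flow can be sketched as a minimal Python client. This is illustrative only: the production client is the TypeScript frameStreamService, and the `websockets` and `opencv-python` packages used here are assumptions, not project dependencies.

```python
# Minimal frame-streaming client sketch (illustrative; the real client is the
# TypeScript frameStreamService). Assumes `websockets` and `opencv-python`.
import asyncio
import base64
import json

import cv2
import websockets

async def stream_frames(url: str = "ws://127.0.0.1:8000/ws/frames") -> None:
    cap = cv2.VideoCapture(0)  # default webcam
    async with websockets.connect(url) as ws:

        async def read_detections() -> None:
            # Detections arrive asynchronously; the backend may skip frames,
            # so replies are not one-to-one with sent frames.
            async for raw in ws:
                msg = json.loads(raw)
                if msg.get("type") == "detection":
                    print("hand:", msg["detection"])

        reader = asyncio.create_task(read_detections())
        try:
            while True:
                ok, frame = cap.read()
                if not ok:
                    break
                # JPEG-compress the frame and wrap it in the base64 data-URL
                # message format the backend expects.
                _, buf = cv2.imencode(".jpg", frame, [cv2.IMWRITE_JPEG_QUALITY, 75])
                payload = base64.b64encode(buf.tobytes()).decode("ascii")
                await ws.send(json.dumps({
                    "type": "frame",
                    "frame": f"data:image/jpeg;base64,{payload}",
                }))
                await asyncio.sleep(1 / 25)  # stay near the 25 FPS budget
        finally:
            reader.cancel()
            cap.release()

asyncio.run(stream_frames())
```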
- Tauri - Rust-based desktop app framework
- React 18 - Modern UI library with hooks
- TypeScript - Type-safe JavaScript
- Tailwind CSS - Utility-first CSS framework
- Zustand - Lightweight state management
- Sonner - Toast notifications (minimal usage)
- FastAPI - High-performance Python web framework
- WebSockets - Real-time bidirectional communication
- OpenCV - Computer vision and image processing
- NumPy - Numerical computing for trajectory processing
- Uvicorn - ASGI server for production performance
- Roboflow - Computer vision platform and inference API
- Inference - Roboflow Inference Python SDK, used to run the model in the Python processing pipeline
- Dynamic Time Warping (DTW) - Custom gesture matching algorithm (sketched below)
- Trajectory Normalization - Mathematical gesture standardization
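For illustration, here is a minimal sketch of both ideas: trajectory normalization plus a classic DTW distance. The function names are ours for this sketch, not the app's actual API (that lives in SimpleGestureRecognizer in main.py):

```python
# Minimal sketch of trajectory normalization + DTW matching. Function names
# are illustrative; the real logic lives in SimpleGestureRecognizer in main.py.
import numpy as np

def normalize_trajectory(points: np.ndarray) -> np.ndarray:
    """Translate to the centroid and scale to unit size so that position
    and scale of the drawn gesture don't affect matching."""
    pts = points - points.mean(axis=0)
    scale = np.abs(pts).max() or 1.0  # guard against a degenerate gesture
    return pts / scale

def dtw_distance(a: np.ndarray, b: np.ndarray) -> float:
    """Classic O(len(a) * len(b)) dynamic time warping over 2-D points."""
    n, m = len(a), len(b)
    cost = np.full((n + 1, m + 1), np.inf)
    cost[0, 0] = 0.0
    for i in range(1, n + 1):
        for j in range(1, m + 1):
            d = np.linalg.norm(a[i - 1] - b[j - 1])
            cost[i, j] = d + min(cost[i - 1, j], cost[i, j - 1], cost[i - 1, j - 1])
    return cost[n, m] / (n + m)  # length-normalized distance

# Lower distance means a better match; a similarity score can be derived from it.
gesture = normalize_trajectory(np.array([[0.1, 0.2], [0.15, 0.25], [0.3, 0.4]]))
template = normalize_trajectory(np.array([[0.0, 0.0], [0.2, 0.2], [0.4, 0.5]]))
print(dtw_distance(gesture, template))
```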
- Vite - Fast build tool and dev server
- Rust - Systems programming for Tauri
- Python 3.8+ - Backend runtime environment
Before installing AirCut, ensure you have the following:
- Operating System: Windows 10+, macOS 10.15+, or Linux
- RAM: Minimum 4GB, recommended 8GB+
- Storage: 500MB free space
- Webcam: Built-in or external USB webcam
- Internet: Required for initial setup and ML model downloads
- Roboflow Account - Free account at roboflow.com
- API Key - Generate from your Roboflow dashboard
git clone https://github.com/furkanksl/aircut.git
cd aircut
npm install
cd backend
python3.11 -m venv venv
# Activate virtual environment
source venv/bin/activate # macOS/Linux
# or
.\venv\Scripts\activate # Windows
pip install -r requirements.txt
Create environment configuration:
# Create backend/.env file
cat > backend/.env << EOF
ROBOFLOW_API_KEY=your_roboflow_api_key_here
ROBOFLOW_MODEL_ID=handdetection-qycc7/1
CONFIDENCE_THRESHOLD=0.2
EOF
cd backend
source venv/bin/activate # or .\venv\Scripts\activate on Windows
python3.11 main.py
npm run tauri dev
- Allow camera permissions when prompted
- Wait for backend connection (green indicator)
- Try recording your first gesture!
- Create Virtual Environment
cd backend
python3.11 -m venv venv
source venv/bin/activate  # macOS/Linux
.\venv\Scripts\activate  # Windows
- Install Python Dependencies
pip install --upgrade pip
pip install -r requirements.txt
pip install inference
- Verify Installation
python3.11 -c "import cv2, fastapi, inference; print('All dependencies installed successfully!')"
- Install Node Dependencies
npm install
- Install Tauri CLI
npm install -g @tauri-apps/cli
- Verify Rust Installation
rustc --version
cargo --version
# Required
ROBOFLOW_API_KEY=your_api_key_here
ROBOFLOW_MODEL_ID=handdetection-qycc7/1
# Optional (with defaults)
CONFIDENCE_THRESHOLD=0.2
SERVER_HOST=127.0.0.1
SERVER_PORT=8000
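A sketch of how the backend might load these settings, assuming the python-dotenv package (an assumption for illustration, not a confirmed project dependency):

```python
# Sketch of reading the settings above, assuming python-dotenv.
import os
from dotenv import load_dotenv

load_dotenv()  # reads backend/.env when run from the backend directory

ROBOFLOW_API_KEY = os.environ["ROBOFLOW_API_KEY"]  # required; raises if missing
ROBOFLOW_MODEL_ID = os.getenv("ROBOFLOW_MODEL_ID", "handdetection-qycc7/1")
CONFIDENCE_THRESHOLD = float(os.getenv("CONFIDENCE_THRESHOLD", "0.2"))
SERVER_HOST = os.getenv("SERVER_HOST", "127.0.0.1")
SERVER_PORT = int(os.getenv("SERVER_PORT", "8000"))
```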
- Visit roboflow.com and create a free account
- Go to your account settings
- Navigate to the "API" section
- Copy your API key
- Paste it in your .env file
- Start the Application
  - Launch both backend and frontend servers
  - Ensure camera permissions are granted
  - Wait for green connection indicator
- Record a Gesture
  - Click the blue "Play" button or wait for auto-start
  - Move your hand in the air to draw a gesture
  - The app tracks your movement with visual feedback
  - Click "Stop" or remove your hand to finish recording
- Save Template
  - Click "Save" after recording a gesture
  - Enter a descriptive name (e.g., "Circle", "Wave")
  - Optionally add a command (e.g., "open browser")
  - Click "Save Template"
- Recognize Gestures
  - Record a new gesture
  - The app automatically compares it to your saved templates
  - See recognition results with confidence scores
  - Execute associated commands
- Hand Detection: Adjust sensitivity for hand detection
- Gesture Recognition: Set threshold for template matching
- Real-time updates without restart required
- View all saved gestures in the Library panel
- Delete unwanted templates
- Templates persist between sessions
- Export/import capabilities (planned)
- Real-time FPS display
- Connection status indicators
- Frame processing statistics
- Detection confidence tracking
- Use good lighting conditions
- Keep hand clearly visible to camera
- Draw gestures at moderate speed
- Make distinct, repeatable movements
- Avoid background clutter
- Create templates with consistent movements
- Use unique gesture shapes
- Avoid overly similar gestures
- Adjust confidence thresholds if needed
- Record multiple templates for variations
# In main.py
DETECTION_FPS = 25 # Inference frequency
INFERENCE_SIZE = 240 # Resolution for ML processing
JPEG_QUALITY = 0.75 # Stream compression
SKIP_FRAMES = 2 # Process every 3rd frame
# Default confidence thresholds
HAND_DETECTION_CONFIDENCE = 0.5 # Hand detection sensitivity
GESTURE_RECOGNITION_CONFIDENCE = 0.6 # Template matching threshold
AUTO_START_DELAY = 1.0 # Seconds before auto-start
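One plausible way the frame-skip and FPS settings can gate inference calls is sketched below; this is illustrative, not the exact logic in main.py:

```python
# Illustrative sketch of SKIP_FRAMES / DETECTION_FPS style throttling;
# the actual gating in main.py may differ.
import time

SKIP_FRAMES = 2      # process every 3rd frame
DETECTION_FPS = 25   # upper bound on inference calls per second

frame_index = 0
last_inference = 0.0

def should_run_inference() -> bool:
    global frame_index, last_inference
    frame_index += 1
    if frame_index % (SKIP_FRAMES + 1) != 0:  # drop 2 of every 3 frames
        return False
    now = time.monotonic()
    if now - last_inference < 1.0 / DETECTION_FPS:  # rate-limit inference
        return False
    last_inference = now
    return True
```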
- Light/Dark mode toggle in top bar
- Automatic system theme detection
- Persistent theme preferences
// Adjustable in UI
trajectoryColor: string; // Gesture trail color
boundingBoxStyle: object; // Hand detection visualization
confidenceDisplay: boolean; // Show confidence scores
WebSocket /ws/frames
Purpose: Real-time camera frame processing and hand detection
Client → Server Messages:
{
"type": "frame",
"frame": "data:image/jpeg;base64,/9j/4AAQ..."
}
{
"type": "update_confidence",
"hand_detection_confidence": 0.5,
"gesture_recognition_confidence": 0.6
}
Server → Client Messages:
{
"type": "detection",
"detection": {
"x": 320.5,
"y": 240.3,
"width": 80.2,
"height": 75.8,
"confidence": 0.87,
"class": "hand"
},
"timestamp": 1699123456.789
}
{
"type": "connection_established",
"message": "Camera stream ready",
"current_hand_confidence": 0.5,
"current_gesture_confidence": 0.6
}
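A skeleton of a /ws/frames handler matching the message shapes above; run_hand_detection() is a hypothetical stand-in for the Roboflow inference call, and error/disconnect handling is omitted for brevity:

```python
# Skeleton /ws/frames handler for the messages documented above.
# run_hand_detection() is a hypothetical stand-in for the Roboflow call;
# it would return a detection dict (x, y, width, height, confidence, class)
# or None when no hand is found.
import base64
import time

import cv2
import numpy as np
from fastapi import FastAPI, WebSocket

app = FastAPI()

@app.websocket("/ws/frames")
async def ws_frames(ws: WebSocket) -> None:
    await ws.accept()
    await ws.send_json({
        "type": "connection_established",
        "message": "Camera stream ready",
        "current_hand_confidence": 0.5,
        "current_gesture_confidence": 0.6,
    })
    while True:
        msg = await ws.receive_json()
        if msg["type"] == "frame":
            # Strip the data-URL prefix and decode the JPEG payload.
            b64 = msg["frame"].split(",", 1)[1]
            frame = cv2.imdecode(
                np.frombuffer(base64.b64decode(b64), np.uint8), cv2.IMREAD_COLOR
            )
            detection = run_hand_detection(frame)  # hypothetical helper
            if detection is not None:
                await ws.send_json({
                    "type": "detection",
                    "detection": detection,
                    "timestamp": time.time(),
                })
        elif msg["type"] == "update_confidence":
            ...  # update thresholds from msg fields
```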
WebSocket /ws/gestures
Purpose: Template management and gesture recognition
Client → Server Messages:
{
"type": "recognize_gesture",
"trajectory": [
{"x": 0.1, "y": 0.2},
{"x": 0.15, "y": 0.25},
...
],
"confidence_threshold": 0.6,
"templates": [
{
"name": "Circle",
"command": "open browser",
"trajectory": [...]
}
]
}
{
"type": "start_tracking",
"message": "Begin hand tracking"
}
{
"type": "stop_tracking",
"message": "Stop hand tracking"
}
Server → Client Messages:
{
"type": "gesture_recognized",
"template_name": "Circle",
"similarity": 0.85,
"command": "open browser"
}
{
"type": "gesture_not_recognized",
"message": "No matching gesture found"
}
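The recognition flow can be sketched as follows, reusing normalize_trajectory and dtw_distance from the DTW sketch earlier; the distance-to-similarity mapping here is one possible choice, not necessarily the one main.py uses:

```python
# Sketch of the /ws/gestures recognition flow. Assumes normalize_trajectory()
# and dtw_distance() from the DTW sketch above; not the exact backend code.
import numpy as np
from fastapi import FastAPI, WebSocket

app = FastAPI()

@app.websocket("/ws/gestures")
async def ws_gestures(ws: WebSocket) -> None:
    await ws.accept()
    while True:
        msg = await ws.receive_json()
        if msg["type"] != "recognize_gesture":
            continue
        gesture = normalize_trajectory(
            np.array([[p["x"], p["y"]] for p in msg["trajectory"]])
        )
        best_name, best_cmd, best_sim = None, None, 0.0
        for tpl in msg["templates"]:
            ref = normalize_trajectory(
                np.array([[p["x"], p["y"]] for p in tpl["trajectory"]])
            )
            # Map DTW distance into a 0..1 similarity (one possible choice).
            sim = 1.0 / (1.0 + dtw_distance(gesture, ref))
            if sim > best_sim:
                best_name, best_cmd, best_sim = tpl["name"], tpl.get("command"), sim
        if best_sim >= msg.get("confidence_threshold", 0.6):
            await ws.send_json({
                "type": "gesture_recognized",
                "template_name": best_name,
                "similarity": round(best_sim, 2),
                "command": best_cmd,
            })
        else:
            await ws.send_json({
                "type": "gesture_not_recognized",
                "message": "No matching gesture found",
            })
```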
GET /health
Purpose: Health check and system status
Response:
{
"status": "healthy",
"camera_active": true,
"tracking_enabled": true,
"frame_count": 1234,
"detection_count": 567,
"inference_available": true,
"model_loaded": true
}
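A quick way to probe this endpoint from Python, assuming the requests package:

```python
# Quick health probe, assuming the `requests` package is installed.
import requests

status = requests.get("http://127.0.0.1:8000/health", timeout=2).json()
print(status["status"], "| model loaded:", status["model_loaded"])
```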
MJPEG Stream (legacy)
Purpose: MJPEG video stream, not used in the current version
Response: Multipart MJPEG stream
aircut/
├── desktop/ # Tauri desktop application
│ ├── src/ # React TypeScript source
│ │ ├── components/ # React components
│ │ │ ├── VideoFeed.tsx # Camera and visualization
│ │ │ └── ui/ # UI component library
│ │ ├── services/ # Business logic
│ │ │ └── frameStreamService.ts
│ │ ├── stores/ # State management
│ │ │ └── appStore.ts # Zustand store
│ │ ├── hooks/ # Custom React hooks
│ │ └── App.tsx # Main application
│ ├── src-tauri/ # Rust backend for Tauri
│ │ ├── src/main.rs # Tauri main process
│ │ └── Cargo.toml # Rust dependencies
│ ├── package.json # Node.js dependencies
│ └── tauri.conf.json # Tauri configuration
├── backend/ # Python FastAPI server
│ ├── main.py # Main server application
│ ├── requirements.txt # Python dependencies
│ └── .env # Environment variables
├── README.md # This documentation
└── .gitignore # Git ignore rules
# Start development server
npm run tauri dev
# Build for production
npm run tauri build
# Run tests
npm run test
# Lint code
npm run lint
# Format code
npm run format
# Start development server with auto-reload
python3.11 main.py
# Run with uvicorn directly
uvicorn main:app --host 127.0.0.1 --port 8000 --reload
- Record a gesture using the UI
- Templates are stored in browser localStorage
- Access via useAppStore().templates
- Modify the SimpleGestureRecognizer class in main.py
- Implement new similarity calculation methods (see the sketch below)
- Add configuration options for new parameters
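As an example of the kind of metric that could be added, here is a deliberately tiny, hypothetical similarity function; the actual SimpleGestureRecognizer interface in main.py may differ:

```python
# Hypothetical example of an alternative similarity metric that could be
# wired into SimpleGestureRecognizer; the class interface may differ.
import numpy as np

def endpoint_similarity(a: np.ndarray, b: np.ndarray) -> float:
    """Toy metric: compare start/end points of two normalized trajectories."""
    d = np.linalg.norm(a[0] - b[0]) + np.linalg.norm(a[-1] - b[-1])
    return 1.0 / (1.0 + d)  # 1.0 = identical endpoints, smaller = further apart
```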
- Create a new component in desktop/src/components/
- Follow existing component patterns
- Use Tailwind CSS for styling
- Integrate with Zustand store for state
- Use functional components with hooks
- Implement proper TypeScript types
- Follow React best practices
- Use Tailwind for styling
- Follow PEP 8 style guide
- Use type hints where appropriate
- Implement proper error handling
- Write docstrings for functions
Symptoms: Black screen, "Camera not available" message
Solutions:
- Check camera permissions in system settings
- Close other applications using the camera
- Try different camera index (if multiple cameras)
- Restart the application
Symptoms: Red connection indicator, WebSocket errors
Solutions:
- Verify backend server is running on port 8000
- Check that both WebSocket endpoints (/ws/frames and /ws/gestures) are available
- Check firewall settings
- Ensure no other service is using port 8000
- Restart backend server
Symptoms: No bounding boxes around hands
Solutions:
- Verify Roboflow API key is correct
- Check that the Camera Stream WebSocket (/ws/frames) is connected
- Check internet connection for API calls
- Improve lighting conditions
- Lower hand detection confidence threshold
- Ensure hand is clearly visible to camera
Symptoms: Gestures not recognized or wrong matches
Solutions:
- Verify that the Gesture WebSocket (/ws/gestures) is connected
- Record more distinct gesture templates
- Adjust gesture recognition confidence
- Ensure consistent gesture recording
- Avoid overly similar gesture shapes
- Re-record templates with better technique
Symptoms: Low FPS, lag, high CPU usage
Solutions:
- Close unnecessary applications
- Reduce video resolution in camera settings
- Increase frame skip rate in configuration
- Use GPU-accelerated inference if available
- Check system resource usage
# Backend logging configuration
logging.basicConfig(level=logging.INFO) # Change to DEBUG for verbose output
- Frame processing statistics via the /health endpoint
- Real-time FPS display in UI
- WebSocket connection status indicators
- Detection confidence tracking
- Open browser developer tools (F12)
- Check Console tab for JavaScript errors
- Monitor Network tab for WebSocket connections
- Use Performance tab for profiling
- GitHub Issues: Report bugs and request features
- Discussions: Ask questions and share tips
- Wiki: Extended documentation and tutorials
- Code review guidelines in CONTRIBUTING.md
- Development setup instructions
- API documentation and examples
# Build Tauri application
npm run tauri build
# Output locations:
# - Windows: target/release/bundle/msi/
# - macOS: target/release/bundle/dmg/
# - Linux: target/release/bundle/deb/ or target/release/bundle/rpm/
# Install production dependencies
pip install -r requirements.txt
# Run with production server
uvicorn main:app --host 0.0.0.0 --port 8000 --workers 1
FROM python:3.11
WORKDIR /app
COPY requirements.txt .
RUN pip install -r requirements.txt
COPY . .
EXPOSE 8000
CMD ["uvicorn", "main:app", "--host", "0.0.0.0", "--port", "8000"]
- Tauri automatically generates platform-specific installers
- Code signing certificates recommended for distribution
- App store submission guidelines available
Include with distribution:
- Minimum system requirements
- Installation instructions
- Camera permission setup
- Roboflow API setup guide
We welcome contributions to AirCut! Here's how to get started:
- Fork the repository
- Clone your fork locally
- Follow the installation instructions
- Create a feature branch
- Make your changes
- Test thoroughly
- Submit a pull request
- Follow existing code style and patterns
- Add tests for new functionality
- Update documentation as needed
- Use clear commit messages
- Ensure all tests pass
- New gesture recognition algorithms
- Additional UI features and improvements
- Performance optimizations
- Platform-specific enhancements
- Documentation improvements
- Bug fixes and testing
This project is licensed under the MIT License - see the LICENSE file for details.
- Roboflow - Computer vision platform and API
- Tauri Team - Desktop application framework
- OpenCV - Computer vision library
- FastAPI - High-performance web framework
- React Team - UI library and ecosystem
- Project Repository: GitHub
- Issues & Bug Reports: GitHub Issues
- Feature Requests: GitHub Discussions
Built with ❤️ for intuitive human-computer interaction