Starred repositories
deepbeepmeep / Wan2GP
Forked from Wan-Video/Wan2.1
A fast AI Video Generator for the GPU Poor. Supports Wan 2.1/2.2, Qwen Image, Hunyuan Video, LTX Video and Flux.
Official Python inference and LoRA trainer package for the LTX-2 audio–video generative model.
Light Image Video Generation Inference Framework
Wan: Open and Advanced Large-Scale Video Generative Models
[AAAI 2026] EchoMimicV3: 1.3B Parameters are All You Need for Unified Multi-Modal and Multi-Task Human Animation
SoulX-FlashTalk is the first 14B model to achieve sub-second start-up latency (0.87s) while maintaining a real-time throughput of 32 FPS on an 8xH800 node.
BISHENG is an open LLM devops platform for next generation Enterprise AI applications. Powerful and comprehensive features include: GenAI workflow, RAG, Agent, Unified model management, Evaluation,…
Fast and Universal 3D reconstruction model for versatile tasks
A command line toolkit to generate maps, point clouds, 3D models and DEMs from drone, balloon or kite images. 📷
A refactored codebase for Gaussian Splatting. Training 3DGS in 50 seconds!
CUDA accelerated rasterization of gaussian splatting
✨ An advanced 3D Gaussian Splatting renderer for THREE.js
Sharp Monocular View Synthesis in Less Than a Second
MNN is a blazing fast, lightweight deep learning framework, battle-tested by business-critical use cases in Alibaba. Full multimodal LLM Android App:[MNN-LLM-Android](./apps/Android/MnnLlmChat/READ…
Fair-code workflow automation platform with native AI capabilities. Combine visual building with custom code, self-host or cloud, 400+ integrations.
The most powerful and modular diffusion model GUI, API and backend with a graph/nodes interface.
An Open Phone Agent Model & Framework. Unlocking the AI Phone for Everyone
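🔥🔥🔥 A free, offline Java AI algorithm toolkit supporting face recognition, liveness detection, expression recognition, object detection, instance segmentation, pedestrian detection, OCR text recognition, license plate recognition, table recognition, ASR+TTS, machine translation, and more. Usable by adding a Maven dependency. Supports PyTorch and Tensorflow, and integrates mainstream models including Mtcnn, InsightFace, SeetaFace6, YOLOv8~v12, PaddleOCR (PPOCRv5), and Whisper.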
Leading free and open-source face recognition system
An open framework for blind navigation based on ESP32
Official inference repo for FLUX.2 models
Unlimited-length talking video generation that supports image-to-video and video-to-video generation
Enterprise-grade, commercial-friendly agentic workflow platform for building next-generation SuperAgents.
[ACM MM 2025] Ditto: Motion-Space Diffusion for Controllable Realtime Talking Head Synthesis
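微舆: a multi-agent public opinion analysis assistant that anyone can use. It breaks through information cocoons, restores the original picture of public opinion, predicts future trends, and supports decision-making. Implemented from scratch, with no dependency on any framework.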
A Lightweight Face Recognition and Facial Attribute Analysis (Age, Gender, Emotion and Race) Library for Python
🗂️ A file list/WebDAV program that supports multiple storages, powered by Gin and Solidjs.