Stars
FastTracker: Real-Time and Accurate Visual Tracking
[EMNLP 2025 Oral] MemoryOS is designed to provide a memory operating system for personalized AI agents.
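A Chinese-language financial trading framework based on multi-agent LLMs - TradingAgents Chinese enhanced edition
An optimized pipeline for DINet that reduces inference latency by up to 60% 🚀. Kudos to the authors of the original repo for this amazing work.
A multi-task real-time/scheduled monitoring and intelligent analysis tool for Xianyu, built on Playwright and AI filtering, with a full-featured admin dashboard. It saves users time filtering Xianyu listings and helps them find the items they want promptly.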
Text-audio foundation model from Boson AI
[CVPR 2025] Hallo3: Highly Dynamic and Realistic Portrait Image Animation with Video Diffusion Transformer
[SIGGRAPH 2025] Official code of the paper "FlexiAct: Towards Flexible Action Control in Heterogeneous Scenarios"
An AI-powered task-management system you can drop into Cursor, Lovable, Windsurf, Roo, and others.
An Industrial-Level Controllable and Efficient Zero-Shot Text-To-Speech System
KeySync: A Robust Approach for Leakage-free Lip Synchronization in High Resolution
Real time interactive streaming digital human
[CVPR'25] InsTaG: Learning Personalized 3D Talking Head from Few-Second Video
Official repo for FaceShot: Bring Any Character into Life
State-of-the-art Image & Video CLIP, Multimodal Large Language Models, and More!
hyfevian / LiveTalking
Forked from lipku/LiveTalking
Real time interactive streaming digital human
[ICCV 2025] ACTalker: an end-to-end video diffusion framework for talking head synthesis that supports both single and multi-signal control (e.g., audio, expression).
MuseTalk: Real-Time High Quality Lip Synchronization with Latent Space Inpainting
The "virtual_human_stream" project is a real-time digital human system supporting audio-video dialogue. It integrates models like ernerf, musetalk, and wav2lip for voice cloning, video stitching, a…
fay is an MCP framework that helps digital humans (2.5D, 3D, mobile, PC, web) or large language models (OpenAI-compatible, DeepSeek) connect to business systems.
[ACM MM 2025] FantasyTalking: Realistic Talking Portrait Generation via Coherent Motion Synthesis
[CVPR 2025] EchoMimicV2: Towards Striking, Simplified, and Semi-Body Human Animation