Sichuan University
- Chengdu
Stars
Phantom: Subject-Consistent Video Generation via Cross-Modal Alignment
Wan: Open and Advanced Large-Scale Video Generative Models
Qwen-Image is a powerful image generation foundation model capable of complex text rendering and precise image editing.
An inference and training framework for multiple image input in Flux Kontext dev
[CVPR'25 Oral] MoGe: Unlocking Accurate Monocular Geometry Estimation for Open-Domain Images with Optimal Training Supervision
[CAD/Graphics 2025][Computers & Graphics] Navigating Large-Pose Challenge for High-Fidelity Face Reenactment with Video Diffusion Model
Code of π^3: Permutation-Equivariant Visual Geometry Learning
Implementation of "EasyControl: Adding Efficient and Flexible Control for Diffusion Transformer" (ICCV 2025)
[NeurIPS 2025] Let Them Talk: Audio-Driven Multi-Person Conversational Video Generation
[CVPR 2025 Best Paper Award] VGGT: Visual Geometry Grounded Transformer
Image inpainting tool powered by SOTA AI models. Remove any unwanted object, defect, or person from your pictures, or erase and replace (powered by Stable Diffusion) anything in your pictures.
Instant voice cloning by MIT and MyShell. Audio foundation model.
[NeurIPS 2025] Image editing is worth a single LoRA! 0.1% training data for fantastic image editing! Surpasses GPT-4o in ID persistence~ MoE ckpt released! Only 4GB VRAM is enough to run!
[ICCV 2025] Official implementations for paper: VACE: All-in-One Video Creation and Editing
[ICLR 2025] ControlAR: Controllable Image Generation with Autoregressive Models
[ICCV'25 Best Paper Finalist] ReCamMaster: Camera-Controlled Generative Rendering from A Single Video
Let's make video diffusion practical!
UniAnimate-DiT: Human Image Animation with Large-Scale Video Diffusion Transformer
VGGSfM: Visual Geometry Grounded Deep Structure From Motion
PyTorch implementation of "Stable-Makeup: When Real-World Makeup Transfer Meets Diffusion Model" (SIGGRAPH 2025)
💬 An extensive collection of exceptional resources dedicated to the captivating world of talking face synthesis! ⭐ If you find this repo useful, please give it a star! 🤩
Wan: Open and Advanced Large-Scale Video Generative Models
Enjoy the magic of Diffusion models!
[CVPR 2025] High-Fidelity Relightable Monocular Portrait Animation with Lighting-Controllable Video Diffusion Model
Pippo: High-Resolution Multi-View Humans from a Single Image
Official PyTorch implementation of the paper "High-quality Animatable Eyelid Shapes from Lightweight Captures" (SIGGRAPH Asia 2024).