Stars
[NeurIPS 2025] Official implementation of "XVerse: Consistent Multi-Subject Control of Identity and Semantic Attributes via DiT Modulation".
🔥 Official impl. of "DetailFlow: 1D Coarse-to-Fine Autoregressive Image Generation via Next-Detail Prediction"
A SOTA open-source image editing model that aims to provide performance comparable to closed-source models like GPT-4o and Gemini 2 Flash.
[CVPR 2025] 🔥 Official impl. of "TokenFlow: Unified Image Tokenizer for Multimodal Understanding and Generation".
Rectified Flow Inversion (RF-Inversion) - ICLR 2025
Scalable and memory-optimized training of diffusion models
[ICLR 2025] Pyramidal Flow Matching for Efficient Video Generative Modeling
CogView4, CogView3-Plus, and CogView3 (ECCV 2024)
Some awesome ComfyUI workflows, built using the comfyui-easy-use node package.
A general fine-tuning kit geared toward diffusion models.
This project aims to reproduce Sora (OpenAI's T2V model); we hope the open-source community will contribute to this project.
Text- and image-to-video generation: CogVideoX (2024) and CogVideo (ICLR 2023)
A powerful anti-burn allowing much higher CFG scales for latent diffusion models (for ComfyUI)
GFPGAN aims at developing Practical Algorithms for Real-world Face Restoration.
"Probabilistic Machine Learning" - a book series by Kevin Murphy
Understand Human Behavior to Align True Needs
Controllable video and image generation: SVD, Animate Anyone, ControlNet, ControlNeXt, LoRA
[AAAI 2025]👔IMAGDressing👔: Interactive Modular Apparel Generation for Virtual Dressing. It enables customizable human image generation with flexible garment, pose, and scene control, ensuring high …
AutoStudio: Crafting Consistent Subjects in Multi-turn Interactive Image Generation
⚡ Flash Diffusion ⚡: Accelerating Any Conditional Diffusion Model for Few Steps Image Generation (AAAI 2025 Oral)
Official implementation of EMOPortraits: Emotion-enhanced Multimodal One-shot Head Avatars
Autoregressive Model Beats Diffusion: 🦙 Llama for Scalable Image Generation
EDM2 and Autoguidance -- Official PyTorch implementation
Unofficial Implementation of Animate Anyone by Novita AI
MiniCPM-V 4.5: A GPT-4o Level MLLM for Single Image, Multi Image and High-FPS Video Understanding on Your Phone