Stars
Official implementation of ATI: Any Trajectory Instruction for Controllable Video Generation. https://arxiv.org/pdf/2505.22944
Official implementation of EPiC: Efficient Video Camera Control Learning with Precise Anchor-Video Guidance
[ICLR'25] SynCamMaster: Synchronizing Multi-Camera Video Generation from Diverse Viewpoints
[ICCV'25 Best Paper Finalist] ReCamMaster: Camera-Controlled Generative Rendering from A Single Video
Uni3C: Unifying Precisely 3D-Enhanced Camera and Human Motion Controls for Video Generation [SIGGRAPH Asia 2025]
GoatWu / Self-Forcing-Plus
Forked from guandeh17/Self-Forcing
Unofficial extension of Self-Forcing that adds support for I2V and 14B training.
Unified Efficient Fine-Tuning of 100+ LLMs & VLMs (ACL 2024)
ModelTC / Wan2.2-Lightning
Forked from Wan-Video/Wan2.2
Wan2.2-Lightning: Speed up the Wan2.2 model with distillation
Qwen-Image is a powerful image generation foundation model capable of complex text rendering and precise image editing.
Directly Aligning the Full Diffusion Trajectory with Fine-Grained Human Preference
Official codebase for "Self Forcing: Bridging Training and Inference in Autoregressive Video Diffusion" (NeurIPS 2025 Spotlight)
[NeurIPS 2025] An official implementation of Flow-GRPO: Training Flow Matching Models via Online RL
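[Three Years of Interviews, Five Years of Mock Exams] An interview handbook for AIGC algorithm engineers. Covers practical interview and written-test experience and core knowledge across the AI industry, including AIGC, traditional deep learning, autonomous driving, AI agents, machine learning, computer vision, natural language processing, reinforcement learning, big data mining, embodied intelligence, the metaverse, and AGI.
📚 AIGC job-hunting interview experiences, essential fundamentals, prompt engineering, ChatGPT, Stable Diffusion, Prompt, Embedding, Fine-tune, and everything else you need to know for an AIGC job search.
[CVPR 2024 Oral] InternVL Family: A Pioneering Open-Source Alternative to GPT-4o. An open-source multimodal dialogue model approaching GPT-4o performance.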
FantasyTalking2: Timestep-Layer Adaptive Preference Optimization for Audio-Driven Portrait Animation
GLM-4.5V and GLM-4.1V-Thinking: Towards Versatile Multimodal Reasoning with Scalable Reinforcement Learning
Video-R1: Reinforcing Video Reasoning in MLLMs [🔥the first paper to explore R1 for video]
A pipeline parallel training script for diffusion models.
Phantom: Subject-Consistent Video Generation via Cross-Modal Alignment
Wan: Open and Advanced Large-Scale Video Generative Models
[SIGGRAPH 2025] The official code of the paper "ColorSurge: Bringing Vibrancy and Efficiency to Automatic Video Colorization via Dual-Branch Fusion"
SVCNet: Scribble-based Video Colorization Network with Temporal Aggregation. IEEE TIP, 2023
[SIGGRAPH 2025] Official repo for paper "Any-length Video Inpainting and Editing with Plug-and-Play Context Control"
Code and data for "AnyV2V: A Tuning-Free Framework For Any Video-to-Video Editing Tasks" [TMLR 2024]
Tiny-DeepSpeed, a minimalistic re-implementation of the DeepSpeed library