Stars
[arXiv 2025] Stable Video Infinity: Infinite-Length Video Generation with Error Recycling
LongLive: Real-time Interactive Long Video Generation
Code release for paper "Test-Time Training Done Right"
This is the official repository for the paper "FLUX-Reason-6M & PRISM-Bench: A Million-Scale Text-to-Image Reasoning Dataset and Comprehensive Benchmark"
Official implementation of the paper "GenCompositor: Generative Video Compositing with Diffusion Transformer"
T2I-ReasonBench: Benchmarking Reasoning-Informed Text-to-Image Generation
Geometry-Consistent Video Diffusion for Robotic Visual Policy Transfer
[NeurIPS 2025] VIKI-R: Coordinating Embodied Multi-Agent Cooperation via Reinforcement Learning
[ICCV 2025 Highlight] Implementation of VMem: Consistent Interactive Video Scene Generation with Surfel-Indexed View Memory
[ICCV 2025] Official implementation of the paper "DreamCube: 3D Panorama Generation via Multi-plane Synchronization"
[arXiv 2025] Learning Video Generation for Robotic Manipulation with Collaborative Trajectory Control
An official implementation of EvoSearch: Scaling Image and Video Generation via Test-Time Evolutionary Search
GoT-R1: Unleashing Reasoning Capability of MLLM for Visual Generation with Reinforcement Learning
[SIGGRAPH Asia 2025] DreamO: A Unified Framework for Image Customization
SkyReels-V2: Infinite-length Film Generative Model
[ICCV 2025] Official repo for "GigaTok: Scaling Visual Tokenizers to 3 Billion Parameters for Autoregressive Image Generation"
MineWorld: A Real-time interactive world model on Minecraft
HoloPart: Generative 3D Part Amodal Segmentation
[ICCV 2025] RoboFactory: Exploring Embodied Agent Collaboration with Compositional Constraints
[CVPR 2025] From Slow Bidirectional to Fast Autoregressive Video Diffusion Models
[ICLR 2025] Autoregressive Video Generation without Vector Quantization
[ICCV 2025] GameFactory: Creating New Games with Generative Interactive Videos
New repo collection for NVIDIA Cosmos: https://github.com/nvidia-cosmos