Stars
Cosmos-RL is a flexible and scalable Reinforcement Learning framework specialized for Physical AI applications.
Ctrl-World: A Controllable Generative World Model for Robot Manipulation
Cosmos-Curate is a powerful video curation system that processes, analyzes, and organizes video content using advanced AI models and distributed computing.
Cosmos-Predict2.5 is the latest version of the Cosmos World Foundation Models (WFMs) family, specialized for simulating and predicting the future state of the world in the form of video.
Cosmos-Transfer2.5, built on top of Cosmos-Predict2.5, produces high-quality world simulations conditioned on multiple spatial control inputs.
(CVPR 2025) From Slow Bidirectional to Fast Autoregressive Video Diffusion Models
Official codebase for "Self Forcing: Bridging Training and Inference in Autoregressive Video Diffusion" (NeurIPS 2025 Spotlight)
DeepSpeed is a deep learning optimization library that makes distributed training and inference easy, efficient, and effective.
Portfolio Tracker: Track your investments and asset allocation
Official implementation of the paper: "FlowEdit: Inversion-Free Text-Based Editing Using Pre-Trained Flow Models"
Cosmos-Predict2 is a collection of general-purpose world foundation models for Physical AI that can be fine-tuned into customized world models for downstream applications.
The official implementation of CVPR'25 Oral paper "Go-with-the-Flow: Motion-Controllable Video Diffusion Models Using Real-Time Warped Noise"
[CVPR 2025] Official code repository for "Pixel-level and Semantic-level Adjustable Super-resolution: A Dual-LoRA Approach"
[CVPR 2025 Highlight] Generative Photography: Scene-Consistent Camera Control for Realistic Text-to-Image Synthesis
[ICCV 2025] Official implementations for paper: VACE: All-in-One Video Creation and Editing
📹 A more flexible framework that can generate videos at any resolution and creates videos from images.
Human Preference Score v2: A Solid Benchmark for Evaluating Human Preferences of Text-to-Image Synthesis
"Effective Whole-body Pose Estimation with Two-stages Distillation" (ICCV 2023, CV4Metaverse Workshop)
Wan: Open and Advanced Large-Scale Video Generative Models
Enjoy the magic of Diffusion models!
Let's make video diffusion practical!
Cosmos-Transfer1 is a world-to-world transfer model designed to bridge the perceptual divide between simulated and real-world environments.
[CVPR2025] A Physics-Informed Blur Learning Framework for Imaging Systems
Video Generation, Physical Commonsense, Semantic Adherence, VideoCon-Physics