Stars
Code for "TD-MPC2: Scalable, Robust World Models for Continuous Control"
🙃 A delightful community-driven (with 2,400+ contributors) framework for managing your zsh configuration. Includes 300+ optional plugins (rails, git, macOS, hub, docker, homebrew, node, php, python…
A simple state update rule to enhance length generalization for CUT3R
ViPE: Video Pose Engine for Geometric 3D Perception
Code for ICCV'2025 (Best student paper honorable mention) "RayZer: A Self-supervised Large View Synthesis Model"
Universal Monocular Metric Depth Estimation
[CVPR'25 Oral] MoGe: Unlocking Accurate Monocular Geometry Estimation for Open-Domain Images with Optimal Training Supervision
Generalizable Perception Stack for all things 3D, 4D & Scene Understanding
MapAnything: Universal Feed-Forward Metric 3D Reconstruction
Code for loralib, an implementation of "LoRA: Low-Rank Adaptation of Large Language Models"
SpatialVID: A Large-Scale Video Dataset with Spatial Annotations
BEHAVIOR-1K: a platform for accelerating Embodied AI research. Join our Discord for support: https://discord.gg/bccR5vGFEx
[ICLR 2025 Oral] Official code for "LVSM: A Large View Synthesis Model with Minimal 3D Inductive Bias"
An ML research template with good documentation by Boyuan Chen, an MIT PhD student
Pretrain, finetune ANY AI model of ANY size on 1 or 10,000+ GPUs with zero code changes.
[CVPR 2025 Best Paper Award] VGGT: Visual Geometry Grounded Transformer
Useful information for getting started in the lab
Official implementation of Continuous 3D Perception Model with Persistent State
Awesome-LLM: a curated list of Large Language Models
A topic-centric list of HQ open datasets.
😎 Awesome lists about all kinds of interesting topics
freeCodeCamp.org's open-source codebase and curriculum. Learn math, programming, and computer science for free.
MEt3R: Measuring Multi-View Consistency in Generated Images
Fast and differentiable MS-SSIM and SSIM for PyTorch.
Easily calculate FVD, PSNR, SSIM, and LPIPS to evaluate the quality of generated or predicted videos.
ICCV 2025 | TesserAct: Learning 4D Embodied World Models
[CVPR2024 Highlight] VBench - We Evaluate Video Generation