Stars
Official implementation of "OpenCity3D: What do Vision-Language Models know about Urban Environments?" @ WACV2025
[CVPR 2025] Code for Segment Any Motion in Videos
The repository provides code for running inference with the Segment Anything Model (SAM), links for downloading the trained model checkpoints, and example notebooks that show how to use the model.
Track-Anything is a flexible and interactive tool for video object tracking and segmentation, based on Segment Anything, XMem, and E2FGVI.
Assistant tools for attention visualization in deep learning
[NeurIPS 2024] Depth Anything V2. A More Capable Foundation Model for Monocular Depth Estimation
The official implementation of the paper “VGGT4D: Mining Motion Cues in Visual Geometry Transformers for 4D Scene Reconstruction.”
Code of π^3: Permutation-Equivariant Visual Geometry Learning
MapAnything: Universal Feed-Forward Metric 3D Reconstruction
[CVPR 2025 Highlight] Video Depth Anything: Consistent Depth Estimation for Super-Long Videos
The repository provides code for running inference and finetuning with the Meta Segment Anything Model 3 (SAM 3), links for downloading the trained model checkpoints, and example notebooks that sho…
CUDA-accelerated rasterization of Gaussian splatting
[CVPR 2025 Best Paper Award] VGGT: Visual Geometry Grounded Transformer
COLMAP - Structure-from-Motion and Multi-View Stereo
[NeurIPS 2025] Official code for Reconstruct, Inpaint, Test-Time Finetune: Dynamic Novel-view Synthesis from Monocular Videos
A data generation pipeline for creating semi-realistic synthetic multi-object videos with rich annotations such as instance segmentation masks, depth maps, and optical flow.
🤗 Transformers: the model-definition framework for state-of-the-art machine learning models in text, vision, audio, and multimodal models, for both inference and training.
Original reference implementation of "3D Gaussian Splatting for Real-Time Radiance Field Rendering"
[CVPR 2024] 4D Gaussian Splatting for Real-Time Dynamic Scene Rendering
Build and share delightful machine learning apps, all in Python. 🌟 Star to support our work!
Hackable and optimized Transformers building blocks, supporting a composable construction.
Wan: Open and Advanced Large-Scale Video Generative Models