Stars
GigaTrain: An Efficient and Scalable Training Framework for AI Models
A generative world for general-purpose robotics & embodied AI learning.
[CSUR] A Survey on Video Diffusion Models
[ECCV'24 & ICLR'25] CityGaussian Series for High-quality Large-Scale Scene Reconstruction with Gaussians
[CVPR'25 Highlight] You See it, You Got it: Learning 3D Creation on Pose-Free Videos at Scale
Official PyTorch Implementation of Synthesizing Coherent Story with Auto-Regressive Latent Diffusion Models
[CVPR 2025] Official implementation of "AnyDressing: Customizable Multi-Garment Virtual Dressing via Latent Diffusion Models"
Official implementation of "MIMO: Controllable Character Video Synthesis with Spatial Decomposed Modeling"
Official implementation of "Splatter Image: Ultra-Fast Single-View 3D Reconstruction" (CVPR 2024)
[ICCV'25] DimensionX: Create Any 3D and 4D Scenes from a Single Image with Controllable Video Diffusion
Open-Sora: Democratizing Efficient Video Production for All
The best OSS video generation models, created by Genmo
This project aims to reproduce Sora (OpenAI's T2V model); we hope the open-source community will contribute to it.
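This project shares the technical principles behind large models along with hands-on experience (large-model engineering and deploying large-model applications).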
Deep learning for image processing, including classification, object detection, etc.
Large World Model -- Modeling Text and Video with Millions Context
Multi-Target Multi-Camera Human Tracking (Non-overlapping camera system)
Study notes for the Shenlan Academy (深蓝学院) Multi-Sensor Fusion for Localization course, 4th session
LeGO-LOAM, LIO-SAM, LVI-SAM, FAST-LIO2, Faster-LIO, VoxelMap, R3LIVE, Point-LIO, KISS-ICP, DLO, DLIO, Ada-LIO, PV-LIO, SLAMesh, ImMesh, FAST-LIO-MULTI, M-LOAM, LOCUS, SLICT, MA-LIO, CT-ICP, GenZ-IC…
A general framework for map-based visual localization. It contains 1) Map Generation, which supports traditional features or deep learning features. 2) Hierarchical-Localizationvisual in visual(points…
Xtreme1 is an all-in-one data labeling and annotation platform for multimodal data training and supports 3D LiDAR point cloud, image, and LLM.
A curated list of tutorials, papers, and software related to multi-view stereo.
Open source Structure-from-Motion pipeline
Real-time Dense Point Cloud, Digital Surface Map (DSM) and (Ortho-)Mosaic Generation for UAVs