- HKU, ZJU, HIT
- Hong Kong SAR
- shawlyu.github.io
- https://orcid.org/0000-0003-1318-4905
Stars
ViPE: Video Pose Engine for Geometric 3D Perception
PyTorch implementation of JiT https://arxiv.org/abs/2511.13720
An open-source, GPU-accelerated physics simulation engine built upon NVIDIA Warp, specifically targeting roboticists and simulation researchers.
SpatialVID: A Large-Scale Video Dataset with Spatial Annotations
Official implementation of "Force Prompting: Video Generation Models Can Learn and Generalize Physics-based Control Signals" (NeurIPS 2025)
[ICCV 2025] PhysTwin: Physics-Informed Reconstruction and Simulation of Deformable Objects from Videos
MapAnything: Universal Feed-Forward Metric 3D Reconstruction
PyTorch code for hierarchical k-means -- a data curation method for self-supervised learning
Cosmos-Reason1 models understand the physical common sense and generate appropriate embodied decisions in natural language through long chain-of-thought reasoning processes.
ICCV 2025 | TesserAct: Learning 4D Embodied World Models
[ICLR 2025 Oral] Official code for "LVSM: A Large View Synthesis Model with Minimal 3D Inductive Bias"
Code for ICCV'2025 (Best student paper honorable mention) "RayZer: A Self-supervised Large View Synthesis Model"
Unified framework for robot learning built on NVIDIA Isaac Sim
PyTorch code and models for VJEPA2 self-supervised learning from video.
A collection of useful functions for 3D vision & graphics research in Python.
🤗 PEFT: State-of-the-art Parameter-Efficient Fine-Tuning.
Let's make video diffusion practical!
[CVPR 2025 Best Paper Award] VGGT: Visual Geometry Grounded Transformer
Mobile manipulation research tools for roboticists
[CVPR 2025 Best Paper Nomination] FoundationStereo: Zero-Shot Stereo Matching
[ICCV 2023] MatrixCity: A Large-scale City Dataset for City-scale Neural Rendering and Beyond.
Python optical flow visualization following Baker et al. (ICCV 2007) as used by the MPI-Sintel challenge
Official implementation of Continuous 3D Perception Model with Persistent State
[CVPR 2025 Highlight] Align3R: Aligned Monocular Depth Estimation for Dynamic Videos
Official implementation of ICCV2023 VideoFlow: Exploiting Temporal Cues for Multi-frame Optical Flow Estimation
Official code for "FeatUp: A Model-Agnostic Framework for Features at Any Resolution" ICLR 2024
[CVPR'25 Oral] MoGe: Unlocking Accurate Monocular Geometry Estimation for Open-Domain Images with Optimal Training Supervision
Hypersim: A Photorealistic Synthetic Dataset for Holistic Indoor Scene Understanding