- Zhejiang University
- Hangzhou, China
- https://hxy-123.github.io/
- @XingyiHe1
Stars
Fast and Universal 3D reconstruction model for versatile tasks
StreamingVLM: Real-Time Understanding for Infinite Video Streams
Automatically claims free games and DLCs on the Epic Games Store, Amazon Prime Gaming and GOG.
Official code for paper "InstantSfM: Fully Sparse and Parallel Structure-from-Motion"
Official PyTorch Implementation of "Diffusion Transformers with Representation Autoencoders"
Hunyuan3D-Omni: A Unified Framework for Controllable Generation of 3D Assets
MapAnything: Universal Feed-Forward Metric 3D Reconstruction
A Modular Toolkit for Robot Kinematic Optimization
[NeurIPS 2025 (Spotlight)] The implementation for the paper "4DGT: Learning a 4D Gaussian Transformer Using Real-World Monocular Videos"
Ray tracing and hybrid rasterization of Gaussian particles
Sim-to-real and CDM inference code for ManipAsInSim project.
Reference PyTorch implementation and models for DINOv3
ViPE: Video Pose Engine for Geometric 3D Perception
Code and dataset link for "DenseWorld-1M: Towards Detailed Dense Grounded Caption in the Real World"
gpt-oss-120b and gpt-oss-20b are two open-weight language models by OpenAI
[CoRL 2025 Oral] One View, Many Worlds: Single-Image to 3D Object Meets Generative Domain Randomization for One-Shot 6D Pose Estimation.
Code for Streaming 4D Visual Geometry Transformer
Official Implementation of "Dens3R: A Foundation Model for 3D Geometry Prediction"
Code of π^3: Permutation-Equivariant Visual Geometry Learning
[ICCV 2025] Diffuman4D: 4D Consistent Human View Synthesis from Sparse-View Videos with Spatio-Temporal Diffusion Models
Implementation of Prompting with the Future: Open-World Model Predictive Control with Interactive Digital Twins. [RSS 2025]
Official implementation of ICCV 2025 paper "EgoAgent: A Joint Predictive Agent Model in Egocentric Worlds".
[NeurIPS 2025 Spotlight] Towards Understanding Camera Motions in Any Video
Code release for paper "Test-Time Training Done Right"