Tsinghua University
- https://steven-xzr.github.io/
Stars
PyTorch-based probabilistic time series forecasting framework built on the GluonTS backend
A Library for Advanced Deep Time Series Models for General Time Series Analysis.
Muon is an optimizer for hidden layers in neural networks
A tale of works on the complexity of first-order bilevel optimization.
The simplest, fastest repository for training/finetuning medium-sized GPTs.
Re-implementation of pi0 vision-language-action (VLA) model from Physical Intelligence
moojink / openvla-oft
Forked from openvla/openvla
Fine-Tuning Vision-Language-Action Models: Optimizing Speed and Success
Code for the project "MegaSaM: Accurate, Fast and Robust Structure and Motion from Casual Dynamic Videos"
Code repository for the Habitat Synthetic Scenes Dataset (HSSD) paper.
The Replica Dataset v1 as published in https://arxiv.org/abs/1906.05797.
[CVPR 2024 Highlight] FoundationPose: Unified 6D Pose Estimation and Tracking of Novel Objects
[RSS25] Official implementation of DemoGen: Synthetic Demonstration Generation for Data-Efficient Visuomotor Policy Learning
RDT-1B: a Diffusion Foundation Model for Bimanual Manipulation
Python efficient farthest point sampling (FPS) library. Compatible with numpy.
Theia: Distilling Diverse Vision Foundation Models for Robot Learning
Universal Manipulation Interface: In-The-Wild Robot Teaching Without In-The-Wild Robots
Implementation of UltraPixel: Advancing Ultra-High-Resolution Image Synthesis to New Peaks
The most powerful and modular diffusion model GUI, api and backend with a graph/nodes interface.
[CVPR 2024 Oral] Official repository for RALF: Retrieval-Augmented Layout Transformer for Content-Aware Layout Generation
Official repository for "PosterLayout: A New Benchmark and Approach for Content-aware Visual-Textual Presentation Layout" (CVPR 2023).
[ECCV2022] Gen6D: Generalizable Model-Free 6-DoF Object Pose Estimation from RGB Images
Realize inverse kinematics in mujoco through mocap control
[RSS 2024] 3D Diffusion Policy: Generalizable Visuomotor Policy Learning via Simple 3D Representations