Starred repositories
Dexbotic: Open-Source Vision-Language-Action Toolbox
Simulation verification and physical deployment of robot reinforcement learning algorithms, suitable for quadruped robots, wheeled robots, and humanoid robots. "sar" represents "simulation and real"
The repository provides code for running inference and finetuning with the Meta Segment Anything Model 3 (SAM 3), links for downloading the trained model checkpoints, and example notebooks that sho…
Code repo for the SIGGRAPH paper "Monocular Online Reconstruction with Enhanced Detail Preservation". Project page: https://poiw.github.io/MODP/index.html
Uni-MoE: Lychee's Large Multimodal Model Family.
CUDA Templates and Python DSLs for High-Performance Linear Algebra
A unified inference and post-training framework for accelerated video generation.
Valdi is a cross-platform UI framework that delivers native performance without sacrificing developer velocity.
Official implementation of "Fast-dLLM: Training-free Acceleration of Diffusion LLM by Enabling KV Cache and Parallel Decoding"
Visualizer for neural network, deep learning and machine learning models
Official implementation of UI-Ins: Enhancing GUI Grounding with Multi-Perspective Instruction-as-Reasoning
The absolute trainer to light up AI agents.
darktable is an open source photography workflow application and raw developer
A powerful cross-platform raw photo processing program
Python wrapper for the NVIDIA cuSFM library
[NeurIPS 2025] An official implementation of Flow-GRPO: Training Flow Matching Models via Online RL
InternVLA-M1: A Spatially Guided Vision-Language-Action Framework for Generalist Robot Policy
Extending Linux support to enable Infinite-ISP on FPGA for the development of a libcamera-based camera application stack.
Infinite-ISP Tuning Tool is a console-based ISP (image signal processor) tuning application that is specifically designed to tune various modules in the Infinite-ISP_GM.
Official repository for BrickGPT, the first approach for generating physically stable toy brick models from text prompts.
Official PyTorch implementation for "Large Language Diffusion Models"
[arXiv 2025] ARMADA: Autonomous Online Failure Detection and Human Shared Control Empower Scalable Real-world Deployment and Adaptation
Official PyTorch Implementation of "Latent Diffusion Model Without Variational Autoencoder".
OmniVinci is an omni-modal LLM for joint understanding of vision, audio, and language.
The official repo of Qwen-VL (通义千问-VL) chat & pretrained large vision language model proposed by Alibaba Cloud.