Stars
Official implementation for "PHUMA: Physically-Grounded Humanoid Locomotion Dataset"
[NeurIPS DB 2025] PartNeXt: A Next-Generation Dataset for Fine-Grained and Hierarchical 3D Part Understanding
NORA: A Small Open-Sourced Generalist Vision Language Action Model for Embodied Tasks
starVLA: A Lego-like Codebase for Vision-Language-Action Model Developing
Isaac Lab API, powered by MuJoCo-Warp, for RL and robotics research.
Official implementation of OpenTrack.
InternVLA-M1: A Spatially Guided Vision-Language-Action Framework for Generalist Robot Policy
A Paper List for Humanoid Robot Learning.
A curated list of behavior(al) foundation model (BFM) papers, articles, tutorials, slides and projects
[ICCV 2025] Towards More Diverse and Challenging Pre-training for Point Cloud Learning: Self-Supervised Cross Reconstruction with Decoupled Views
[NeurIPS 2024 Spotlight] PCP-MAE: Learning to Predict Centers for Point Masked Autoencoders
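This repository contains the complete research and analysis materials from reverse engineering Claude Code v1.0.33. It includes an in-depth technical analysis of the obfuscated source code, system architecture documentation, and an implementation blueprint for reconstructing the Claude Code agent system. Key findings include a real-time Steering mechanism, a multi-agent architecture, intelligent context management, and a tool execution pipeline. The project serves as a technical reference for understanding the design and implementation of modern AI agent systems.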
verl-agent is an extension of veRL, designed for training LLM/VLM agents via RL. verl-agent is also the official code for the paper "Group-in-Group Policy Optimization for LLM Agent Training"
Official code for "Embodied-R1: Reinforced Embodied Reasoning for General Robotic Manipulation"
X-SAM: From Segment Anything to Any Segmentation (AAAI2026)
A collection of papers/projects that train flow matching models/policies via RL.
Reference PyTorch implementation and models for DINOv3
Official Repo of TexVerse: A Universe of 3D Objects with High-Resolution Textures
A high-throughput and memory-efficient inference and serving engine for LLMs
verl: Volcano Engine Reinforcement Learning for LLMs
Being-H0: Vision-Language-Action Pretraining from Large-Scale Human Videos
This repository provides the code and model checkpoints for AIMv1 and AIMv2 research projects.
Qwen-Image is a powerful image generation foundation model capable of complex text rendering and precise image editing.
gpt-oss-120b and gpt-oss-20b are two open-weight language models by OpenAI
Generating Immersive, Explorable, and Interactive 3D Worlds from Words or Pixels with Hunyuan3D World Model