Stars
Imitation learning policy training and inference framework for paper: Real-to-Sim Robot Policy Evaluation with Gaussian Splatting Simulation of Soft-Body Interactions
Open-source code of the paper: Real-to-Sim Robot Policy Evaluation with Gaussian Splatting Simulation of Soft-Body Interactions.
Connect AI models like Claude & GPT with robots using MCP and ROS.
This is a curated list of "Embodied AI or robot with Large Language Models" research. Watch this repository for the latest updates! 🔥
🤗 LeRobot: Making AI for Robotics more accessible with end-to-end learning
NVIDIA Isaac GR00T N1.5 - A Foundation Model for Generalist Robots.
Nvidia GEAR Lab's initiative to solve the robotics data problem using world models
Mastering Diverse Domains through World Models
[ICML 2024] 3D-VLA: A 3D Vision-Language-Action Generative World Model
Octo is a transformer-based robot policy trained on a diverse mix of 800k robot trajectories.
✨✨Latest Advances on Multimodal Large Language Models
[RSS 2023] Diffusion Policy: Visuomotor Policy Learning via Action Diffusion
A living list of important industry innovators in the Robotics and AI space
moojink / openvla-oft
Forked from openvla/openvla. Fine-Tuning Vision-Language-Action Models: Optimizing Speed and Success
A list of awesome and popular robot learning environments
DelinQu / SimplerEnv-OpenVLA
Forked from simpler-env/SimplerEnv. Evaluating and reproducing real-world robot manipulation policies (e.g., RT-1, RT-1-X, Octo, and OpenVLA) in simulation under common setups (e.g., Google Robot, WidowX+Bridge)
Paper list in the survey paper: Toward General-Purpose Robots via Foundation Models: A Survey and Meta-Analysis
Benchmarking Knowledge Transfer in Lifelong Robot Learning
openvla / openvla
Forked from TRI-ML/prismatic-vlms. OpenVLA: An open-source vision-language-action model for robotic manipulation.
A curated list of state-of-the-art research in embodied AI, focusing on vision-language-action (VLA) models, vision-language navigation (VLN), and related multimodal learning approaches.
Latest Advances on Embodied Multimodal LLMs (or Vision-Language-Action Models).
Learn how to design large-scale systems. Prep for the system design interview. Includes Anki flashcards.
200+ detailed flashcards useful for reviewing topics in machine learning, computer vision, and computer science.
A helpful 5-page machine learning cheatsheet to assist with exam reviews, interview prep, and anything in-between.