The repository provides code for the EgoMAN model and dataset-creation scripts.
Simulation of manipulation tasks using Galaxea robots
F1: A Vision Language Action Model Bridging Understanding and Generation to Actions
InternVLA-M1: A Spatially Guided Vision-Language-Action Framework for Generalist Robot Policy
Building General-Purpose Robots Based on Embodied Foundation Model
DelinQu / SimplerEnv-OpenVLA
Forked from simpler-env/SimplerEnv. Evaluating and reproducing real-world robot manipulation policies (e.g., RT-1, RT-1-X, Octo, and OpenVLA) in simulation under common setups (e.g., Google Robot, WidowX+Bridge)
[NeurIPS 2025] Official implementation of "RoboRefer: Towards Spatial Referring with Reasoning in Vision-Language Models for Robotics"
ViPE: Video Pose Engine for Geometric 3D Perception
Wan: Open and Advanced Large-Scale Video Generative Models
Qwen-Image is a powerful image generation foundation model capable of complex text rendering and precise image editing.
RoboBrain 2.0: Advanced version of RoboBrain. See Better. Think Harder. Do Smarter. 🎉🎉🎉
Practicalli customisations to the Doom Emacs configuration
[ICML 2025] OTTER: A Vision-Language-Action Model with Text-Aware Visual Feature Extraction
Official implementation of the paper: Task Reconstruction and Extrapolation for $\pi_0$ using Text Latent (https://arxiv.org/pdf/2505.03500)
Code for RSS 2025 paper "Can We Detect Failures Without Failure Data? Uncertainty-Aware Runtime Failure Detection for Imitation Learning Policies"
A Modular Toolkit for Robot Kinematic Optimization
The official implementation of the paper "Human Motion Diffusion as a Generative Prior"
[CoRL 2025] UniSkill: Imitating Human Videos via Cross-Embodiment Skill Representations
[RSS 2025] Learning to Act Anywhere with Task-centric Latent Actions
NVIDIA Isaac GR00T N1.6 - A Foundation Model for Generalist Robots.
[CVPR 2025] The official implementation of "Universal Actions for Enhanced Embodied Foundation Models"
Awesome-LLM-3D: a curated list of resources on multi-modal large language models in the 3D world
《开源大模型食用指南》: a tutorial tailored for Chinese beginners, covering rapid fine-tuning (full-parameter/LoRA) and deployment of domestic and international open-source large language models (LLMs) and multimodal large models (MLLMs) in a Linux environment
CleanDiffuser: An Easy-to-use Modularized Library for Diffusion Models in Decision Making
[CVPR 2025 Best Paper Award] VGGT: Visual Geometry Grounded Transformer
RoboDual: Dual-System for Robotic Manipulation
Qwen2.5-Omni is an end-to-end multimodal model by Qwen team at Alibaba Cloud, capable of understanding text, audio, vision, video, and performing real-time speech generation.