Humanoid Policy ~ Human Policy


Introduction

This repository contains the codebase for the paper "Humanoid Policy ~ Human Policy".

It trains egocentric (i.e., no wrist cameras) humanoid manipulation policies, with minimal wrappers so the focus stays on the core components.

Repo Structure

| - assets: robot URDFs and meshes
| - cet: Mujoco simulation for replaying and rolling out policies, used for code development and for adding new embodiments
| - configs: configs for robots and simulation environments
| - data: placeholder for data with some visualization scripts
| - docs: documentation
| - hdt: main learning framework
| - human_data: scripts and interfaces for collecting human demonstration data
| - sim_test (legacy): legacy ALOHA cube transfer test, kept as a minimal sanity-check example

Supported Algorithms

  • ACT (with options to use ResNet, DinoV2, and CLIP backbones)
  • Vanilla Diffusion Policy (based on the official Colab)
  • RDT (trainer runs, but not fully tested)

Setup dependencies

Clone the codebase.

cd ~
git clone --recursive https://github.com/RogerQi/human-policy

Follow INSTALL.md to install required dependencies.

Download open-sourced data

We open-source three sets of recordings on Hugging Face:

  • Human demonstrations of the tasks described in the paper, performed by many people in diverse in-the-wild scenes.
  • Teleoperated demonstrations from two Unitree H1 humanoid robots, physically located at UCSD and CMU.
  • Teleoperated demonstrations from one Unitree H1 humanoid robot in Mujoco.

To download the data, run

cd data/recordings
bash download_data.sh

Visualize downloaded data

We provide scripts to examine the actions and visual observations in downloaded data.

cd data/
# You can take a look at the argparse inside the script to change data path for visualization
python plot_keypoints.py
python plot_visual_obs.py

Converting your own data to our human-centric representation format

Human data is a scalable source for manipulation policy learning, and we believe humanoid policies should make good use of it. To process your own human or humanoid data into our format, please refer to the data conversion scripts in this repository.
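To make the target concrete, here is a minimal sketch of writing one demonstration episode to an HDF5 file. The group and dataset names below ("observations/images", "observations/keypoints", "actions") and the array shapes are illustrative assumptions, not the repository's actual schema — consult the conversion scripts for the real layout.

```python
# Hypothetical sketch: saving one episode as HDF5. Key names and
# shapes are assumptions for illustration, not the repo's real schema.
import h5py
import numpy as np

def write_episode(path, images, keypoints, actions):
    """Save one demonstration episode to an HDF5 file."""
    with h5py.File(path, "w") as f:
        obs = f.create_group("observations")
        # Egocentric RGB frames: (T, H, W, 3), uint8
        obs.create_dataset("images", data=images, compression="gzip")
        # Per-frame hand/body keypoints: (T, K, 3)
        obs.create_dataset("keypoints", data=keypoints)
        # Per-frame action targets: (T, D)
        f.create_dataset("actions", data=actions)

# Example with dummy data
T = 10
write_episode(
    "processed_episode_example.hdf5",
    images=np.zeros((T, 64, 64, 3), dtype=np.uint8),
    keypoints=np.zeros((T, 21, 3), dtype=np.float32),
    actions=np.zeros((T, 26), dtype=np.float32),
)
```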

Training

Let's use the toy humanoid manipulation data in Mujoco and a simple ACT policy with a ResNet backbone as an example. Other data/model options are available in the model configs and dataset configs.

To launch simple training on a single GPU (with at least 24GB VRAM), run

python main.py --chunk_size 100 --batch_size 64 --num_epochs 50000 --lr 1e-4 --seed 0 --exptid 'mujoco_sim_test_resnet_100cs' --dataset_json_path configs/datasets/mujoco_sim.json  --model_cfg_path configs/models/act_resnet.yaml --no_wandb

For more sophisticated training setups, such as BF16, torch.compile, or multi-GPU training, the training script supports the Hugging Face Accelerate library.

Start by generating an Accelerate config file:

accelerate config --config_file ./accelerator_setup.yaml

Then launch training with it:

accelerate launch --config_file ./accelerator_setup.yaml  main.py --chunk_size 100 --batch_size 64 --num_epochs 50000 --lr 1e-4 --seed 0 --exptid 'mujoco_sim_test_resnet_100cs' --dataset_json_path configs/datasets/mujoco_sim.json  --model_cfg_path configs/models/act_resnet.yaml --no_wandb

For policies without complex architectures, such as ACT, we recommend using the val_and_jit_trace option to create traced models.

accelerate launch --config_file ./accelerator_setup.yaml  main.py --chunk_size 100 --batch_size 64 --num_epochs 50000 --lr 1e-4 --seed 0 --exptid 'mujoco_sim_test_resnet_100cs' --dataset_json_path configs/datasets/mujoco_sim.json  --model_cfg_path configs/models/act_resnet.yaml --no_wandb --val_and_jit_trace

By default, the main trainer uses accelerator.load_state to resume model/optimizer states from the latest checkpoint in the experiment directory (exptid). The val_and_jit_trace flag then skips the training loop and uses the data format from the training data loader to create a traced model at (exptid)/policy_traced.pt.
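The idea behind tracing can be sketched as follows. The TinyPolicy module below is a stand-in, not the repo's actual ACT model; the actual trainer handles this via the val_and_jit_trace path, with an example input matching the training data format.

```python
# Minimal sketch of JIT-tracing a policy, analogous to what
# --val_and_jit_trace produces. TinyPolicy is a stand-in, not the
# repo's actual ACT model.
import torch
import torch.nn as nn

class TinyPolicy(nn.Module):
    def __init__(self, obs_dim=8, act_dim=4, chunk_size=100):
        super().__init__()
        self.net = nn.Linear(obs_dim, act_dim * chunk_size)
        self.act_dim = act_dim
        self.chunk_size = chunk_size

    def forward(self, obs):
        # Predict a whole action chunk from one observation
        out = self.net(obs)
        return out.view(-1, self.chunk_size, self.act_dim)

policy = TinyPolicy().eval()
example_obs = torch.zeros(1, 8)
# Trace with an example input matching the data loader's format
traced = torch.jit.trace(policy, example_obs)
traced.save("policy_traced_example.pt")

# The traced file loads without the Python class definition available
loaded = torch.jit.load("policy_traced_example.pt")
actions = loaded(example_obs)
```

Because the traced artifact is self-contained, the rollout side only needs the .pt file and the dataset statistics, not the training code.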

(Virtual) Policy Rollout

Continuing from the previous example, after the policy is trained and traced, we can roll it out in Mujoco or on the real robot. The required components are the dataset statistics and the traced policy weights. An example command is included below; for complete details, please refer to docs/humanoid_mujoco.md.

cd ../cet
python mujoco_rollout_replay.py  --hdf_file_path ../data/recordings/processed/1061new_sim_pepsi_grasp_h1_2_inspire-2025_02_11-22_20_48/processed_episode_0.hdf5 --norm_stats_path ../hdt/mujoco_sim_test_resnet_100cs/dataset_stats.pkl  --plot --model_path ../hdt/mujoco_sim_test_resnet_100cs/policy_traced.pt  --tasktype pepsi --chunk_size 100 --policy_config_path ../hdt/configs/models/act_resnet.yaml

Human Data Collection Guide

After setting up the ZED camera mount following our hardware documentation, you can start collecting human data.

Step 1: Install Dependencies

First, initialize and update the opentv submodule:

git submodule update --init --recursive

Then follow the submodule's README to complete the environment setup.


Step 2: Collect Human Data

Run the following command to start the data collection process:

cd ./human_data
python human_data.py --des task_name --description "description of the task"
  • --des: A short name for the task (e.g., pouring, cutting)
  • --description: A more detailed description of the task
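The CLI interface above can be sketched with argparse; the flag names come from the command shown, but the actual human_data.py may define additional options and defaults.

```python
# Sketch of the human_data.py CLI described above. Only the two
# documented flags are modeled; the real script may have more.
import argparse

def build_parser():
    parser = argparse.ArgumentParser(
        description="Collect human demonstration data")
    parser.add_argument("--des", required=True,
                        help="Short task name, e.g. 'pouring' or 'cutting'")
    parser.add_argument("--description", required=True,
                        help="More detailed description of the task")
    return parser

# Parse an example invocation
args = build_parser().parse_args(
    ["--des", "pouring", "--description", "pour water into a cup"])
```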

Gesture-Based Control

We use simple hand gestures to control the data collection flow:

  • Record Gesture: Start and stop recording a demonstration.
  • Drop Gesture: Cancel the current recording.

Recording Pipeline

The data collection process moves through the following internal state transitions:

  1. Use the Record Gesture to enter and exit the RECORDING state.
  2. Use the Drop Gesture to cancel the current recording and return to WAITING.
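The two transitions above can be sketched as a small state machine, assuming just two states (WAITING, RECORDING) and the two gestures; the actual collection script may track more states.

```python
# Minimal sketch of the recording state machine described above,
# assuming two states and two gestures (an illustrative model, not
# the actual human_data implementation).
from enum import Enum

class State(Enum):
    WAITING = "waiting"
    RECORDING = "recording"

class RecordingSession:
    def __init__(self):
        self.state = State.WAITING
        self.frames = []
        self.episodes = []

    def on_record_gesture(self):
        if self.state is State.WAITING:
            # Enter RECORDING with a fresh frame buffer
            self.frames = []
            self.state = State.RECORDING
        else:
            # Exit RECORDING: save the episode
            self.episodes.append(self.frames)
            self.state = State.WAITING

    def on_drop_gesture(self):
        # Cancel the current recording without saving
        self.frames = []
        self.state = State.WAITING

session = RecordingSession()
session.on_record_gesture()      # enter RECORDING
session.frames.append("frame_0")
session.on_record_gesture()      # exit RECORDING, episode saved
```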

Step 3: Post-process Human Data

Run the following command to post-process the collected data:

cd ./cet
python post_process_zed.py --taskid task_name --multiprocess

TODOs

  • Add teleoperation scripts to collect more Mujoco data
  • Alleviate the known 'sticky finger' friction issue in Mujoco sim rollout
  • Add example for forward kinematics / retargeting for a new humanoid
