Zican Hu1,2, Wei Liu3, Xiaoye Qu2, Xiangyu Yue4, Chunlin Chen1, Zhi Wang1,2✉, Yu Cheng4✉
1Nanjing University 2Shanghai AI Laboratory 3The Hong Kong University of Science and Technology 4The Chinese University of Hong Kong
If you find our paper useful, please consider starring this repository and citing it:
@inproceedings{hu2024divide,
title={Divide and Conquer: Grounding LLMs as Efficient Decision-Making Agents via Offline Hierarchical Reinforcement Learning},
author={Zican Hu and Wei Liu and Xiaoye Qu and Xiangyu Yue and Chunlin Chen and Zhi Wang and Yu Cheng},
year={2025},
booktitle={Proceedings of the 42nd International Conference on Machine Learning}
}
GLIDER is tested on two benchmark tasks, ScienceWorld and AlfWorld. Follow the instructions in [ScienceWorld] and [AlfWorld] to install them.
Create a virtual environment with conda and install the dependencies listed in requirements.txt:
conda create -n glider python=3.10 -y
conda activate glider
pip install -r requirements.txt
Run SFT training with the following script, using the corresponding config in ./config/glider_bc.json:
# PARTITION_NAME: your partition; NODE_NAME: worker node; NUM_CPUS: CPU constraint
srun -p PARTITION_NAME \
-w NODE_NAME \
-c NUM_CPUS \
deepspeed --num_gpus NUM_GPUS --master_port=PORT_NUMBER train_glider_bc.py
Or simply run the shell file:
sh glider_bc.sh
Set the data collection mode in ./config/collection.json, then run:
srun -p PARTITION_NAME \
-w NODE_NAME \
-c NUM_CPUS \
deepspeed --num_gpus 1 --master_port=PORT_NUMBER glider_data_collection.py
Then run ORL training with the following script, using the corresponding config in ./config/glider_awac.json:
sh glider_awac.sh
Set the task name and checkpoint path in ./config/glider_o2o.json, then run O2O training:
# PARTITION_NAME: your partition; NODE_NAME: worker node; NUM_CPUS: CPU constraint
srun -p PARTITION_NAME \
-w NODE_NAME \
-c NUM_CPUS \
deepspeed --num_gpus NUM_GPUS --master_port=PORT_NUMBER train_glider_online.py
Set the evaluation settings in ./config/eval.json, then run the shell file:
sh eval.sh
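
Each stage above reads its settings from a JSON file under ./config. As a minimal sketch, assuming those files are plain JSON and that commands are launched from the repository root, the following helper checks that every config file named in this README exists and parses before a job is submitted. The helper is hypothetical and not part of the GLIDER code base; `STAGES` and `check_configs` are illustrative names, and the script does not inspect any config keys, since their schema is not described here.

```python
import json
from pathlib import Path

# Stage order and config paths as listed in this README. The list layout is an
# illustrative assumption, not something defined by the GLIDER code base.
STAGES = [
    ("SFT", "./config/glider_bc.json"),
    ("data collection", "./config/collection.json"),
    ("ORL", "./config/glider_awac.json"),
    ("O2O", "./config/glider_o2o.json"),
    ("evaluation", "./config/eval.json"),
]


def check_configs(root="."):
    """Return one (stage, path, status) tuple per config file under `root`."""
    report = []
    for stage, rel_path in STAGES:
        path = Path(root) / rel_path
        if not path.is_file():
            status = "missing"
        else:
            try:
                # Only verify that the file parses; keys are not validated.
                json.loads(path.read_text(encoding="utf-8"))
                status = "ok"
            except json.JSONDecodeError as err:
                status = f"invalid JSON: {err.msg} (line {err.lineno})"
        report.append((stage, str(path), status))
    return report


if __name__ == "__main__":
    # Print a small table so problems are visible before submitting a job.
    for stage, path, status in check_configs():
        print(f"{stage:16s} {path:32s} {status}")
```

Running the script from the repository root before `srun` or `sh` can catch a missing or malformed config file early, rather than after a cluster job has been scheduled.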