- Follow the instructions here to set up verl: https://verl.readthedocs.io/en/latest/README_vllm0.8.html
- Then activate the environment and install seaborn: `conda activate verl`, followed by `pip install seaborn`
Our training and test data are stored on Hugging Face:
- Training data, stage 1: https://huggingface.co/datasets/CMU-AIRe/e3-math-easy
- Training data, stage 2: https://huggingface.co/datasets/CMU-AIRe/e3-math-medhard
- Test data: https://huggingface.co/datasets/CMU-AIRe/hmmt-aime-2025
To set up data for training:
- Create a local directory to store data
- Run `python examples/data_preprocess/math/generate_dataset.py --local_dir $dir --remote_dir $hf_dir --split $split`
- Ensure that `data.train_files` and `data.val_files` in your scripts (e.g., `scripts/grpo/grpo_16k.sh`) point to the downloaded data
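
The data setup steps above can be sketched as a short shell snippet. This is only an illustration: the local directory, the form of `$hf_dir` (assumed here to be a Hugging Face dataset ID), the split name, and the output file names are all assumptions, so check `generate_dataset.py` for what it actually expects and writes. The snippet prints the commands rather than running them.

```shell
# Assumed values for illustration only; adjust to your setup.
dir="./data/e3-math-easy"        # hypothetical local directory
hf_dir="CMU-AIRe/e3-math-easy"   # assumption: dataset ID of the stage 1 training data
split="train"                    # assumption: split name

# Step 1: create the local directory.
mkdir -p "$dir"

# Step 2: build the preprocessing command described above.
cmd="python examples/data_preprocess/math/generate_dataset.py --local_dir $dir --remote_dir $hf_dir --split $split"

# Print the command to check it before running it from the repository root.
echo "$cmd"

# Step 3: the training script should then point at the downloaded files.
# The file names below are hypothetical; use whatever the preprocessing step writes.
echo "data.train_files=$dir/train.parquet"
echo "data.val_files=$dir/test.parquet"
```

Once the printed command looks right, run it from the repository root, then copy the resulting paths into the `data.train_files` and `data.val_files` settings of your training script.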
For evaluation, run `bash scripts/eval.sh`.