Bridging Synthetic and Real Routing Problems via LLM-Guided Instance Generation and Progressive Adaptation
This repository contains the code for a productive framework that evolves instance generators for fine-tuning neural combinatorial solvers, together with an efficient progressive fine-tuning approach that improves their generalization on the real-world benchmark datasets for the TSP (Traveling Salesman Problem) and the CVRP (Capacitated Vehicle Routing Problem), namely TSPLib (Reinelt, 1991) and CVRPLib (Uchoa et al., 2017).
Our paper has been accepted to the AAAI-2026 main technical track (acceptance rate: 17%).
We encourage interested readers to inspect and test our code.
The code was run on a machine with Python 3.12.4.
See requirements.txt for all required packages. The project structure is straightforward: the code consists of .py files and should be easy to read, understand, and run.
All paths below assume you are in the root directory EvoReal_main/.
To reproduce generators aligned with real-world distributions (e.g. TSPLib, CVRPLib), first complete the requirements below, then run main.py with the desired configs.
Important Requirements
To implement EvoReal, download the checkpoints from the GitHub repositories of POMO and LEHD:
For POMO: https://github.com/yd-kwon/POMO/tree/master/NEW_py_ver (checkpoint-2000.pt for TSP, and checkpoint-30500.pt for CVRP)
For LEHD: https://github.com/CIAM-Group/NCO_code/tree/main/single_objective/LEHD/TSP (checkpoint-150.pt for TSP, and checkpoint-40.pt for CVRP)
and place them in the following folders:
/EvoReal_main/TSP/POMO/result/saved_tsp100_model
/EvoReal_main/TSP/LEHD/result/20230509_153705_train
/EvoReal_main/CVRP/POMO/result/pretrained_cvrp100_model
/EvoReal_main/CVRP/LEHD/result/20230817_235537_train
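The checkpoint placement described above can be verified before launching anything. The following is a minimal sketch, not part of the repository: the helper name `missing_checkpoints` is hypothetical, and the pairing of each checkpoint file with its folder is an assumption inferred from the lists above (TSP and CVRP checkpoints matched to the TSP and CVRP folders of POMO and LEHD).

```python
from pathlib import Path

# Assumed folder -> file pairing, inferred from the instructions above.
# Adjust it if your checkpoints are organised differently.
EXPECTED_CHECKPOINTS = {
    "TSP/POMO/result/saved_tsp100_model": "checkpoint-2000.pt",
    "TSP/LEHD/result/20230509_153705_train": "checkpoint-150.pt",
    "CVRP/POMO/result/pretrained_cvrp100_model": "checkpoint-30500.pt",
    "CVRP/LEHD/result/20230817_235537_train": "checkpoint-40.pt",
}


def missing_checkpoints(root="."):
    """Return the expected checkpoint paths (relative to root) that are absent."""
    root = Path(root)
    return [
        (Path(folder) / name).as_posix()
        for folder, name in EXPECTED_CHECKPOINTS.items()
        if not (root / folder / name).is_file()
    ]


if __name__ == "__main__":
    # Run from the EvoReal_main/ root directory.
    missing = missing_checkpoints(".")
    if missing:
        print("Missing checkpoints:")
        for path in missing:
            print("  " + path)
    else:
        print("All expected checkpoints found.")
```

Running the script from EvoReal_main/ lists any checkpoint that still needs to be downloaded or moved.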
Also install the non-learning-based solvers, Concorde and HGS, on your machine before running the evolution.
To install the HGS solver, run pip install hygese in your local environment.
To install the Concorde solver, refer to https://blog.csdn.net/u011412840/article/details/122276492 or https://www.researchgate.net/publication/324485167_Concorde_solver_installation_and_use
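Before running the evolution, a quick check can confirm that both solvers are discoverable. This is a sketch under stated assumptions: it only tests whether the `hygese` package can be imported and whether an executable named `concorde` is on the PATH. The executable name is an assumption, since the repository may call Concorde through a different wrapper or path, and the helper `solver_status` is hypothetical.

```python
import importlib.util
import shutil


def solver_status(concorde_binary="concorde"):
    """Report whether the HGS package and a Concorde executable are discoverable.

    concorde_binary is an assumed executable name; change it to match
    how Concorde was installed on your machine.
    """
    return {
        # True if `pip install hygese` succeeded in this environment.
        "hygese": importlib.util.find_spec("hygese") is not None,
        # True if an executable with the given name is on the PATH.
        "concorde": shutil.which(concorde_binary) is not None,
    }


if __name__ == "__main__":
    for name, available in solver_status().items():
        print(f"{name}: {'found' if available else 'NOT found'}")
```

A "NOT found" for Concorde does not necessarily mean the installation failed; it may only mean the binary lives outside the PATH.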
Provide your OpenAI API key either by exporting it as an environment variable or by pasting it directly into openai.yaml under /cfg/llm_client in the corresponding problem folder (e.g. /EvoReal_main/TSP/cfg/llm_client/openai.yaml). To export the key in bash:
export OPENAI_API_KEY="sk-xxxxxxxxxxxxxxxxxxxxxxxx"
Set all other configs and hyperparameters in config.yaml under /cfg in the corresponding problem folder (e.g. /EvoReal_main/TSP/cfg/config.yaml).
# For TSP problems:
cd EvoReal_main/TSP
python main.py
# For CVRP problems:
cd EvoReal_main/CVRP
python main.py
You can also change the default model to another OpenAI model (e.g. o3, gpt-4o) in the openai.yaml file.
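If the key is supplied through the environment rather than through openai.yaml, it can be useful to confirm that Python actually sees it before starting a long evolution run. The sketch below is illustrative only: `load_openai_key` and `mask_key` are hypothetical helpers, not functions from this repository, and they only read the `OPENAI_API_KEY` variable mentioned above.

```python
import os


def load_openai_key(env_var="OPENAI_API_KEY"):
    """Return the API key from the environment, or raise if it is unset or empty."""
    key = os.environ.get(env_var, "").strip()
    if not key:
        raise RuntimeError(
            f"{env_var} is not set; export it or put the key in openai.yaml."
        )
    return key


def mask_key(key, visible=4):
    """Hide most of the key so it can be printed or logged safely."""
    if len(key) <= visible + 3:
        return "*" * len(key)
    return f"{key[:3]}...{key[-visible:]}"


if __name__ == "__main__":
    try:
        print("Using key:", mask_key(load_openai_key()))
    except RuntimeError as err:
        print("Warning:", err)
```

Masking the key keeps it out of terminal history and log files while still showing which key is in use.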
For POMO: After generating the elitist (best) generators, place the three types of best-found elitist generators in the following files and run train_p1.py:
for TSP:
\POMO-EvoReal-2phase_finetune\TSP\POMO_TSP_p1\evolved_gen\gpt_S1.py
\POMO-EvoReal-2phase_finetune\TSP\POMO_TSP_p1\evolved_gen\gpt_S2.py
\POMO-EvoReal-2phase_finetune\TSP\POMO_TSP_p1\evolved_gen\gpt_S3.py
for CVRP:
\POMO-EvoReal-2phase_finetune\CVRP\POMO_CVRP_p1\POMO\generate_instance_p1.py
For LEHD: The training samples are generated by the best-found evolved generator and labelled with strong heuristic solvers (Concorde or HGS), using a larger number of training samples than in generator evolution (e.g. 50 thousand problems instead of the 10 thousand used during evolution).
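The generate-then-label workflow described for LEHD can be sketched as follows. Everything here is a stand-in for illustration: `uniform_generator` is a hypothetical placeholder for an evolved generator, `nearest_neighbour_tour` is a simple greedy heuristic used only so the example runs without Concorde or HGS, and the dictionary layout is not the LEHD training-data format.

```python
import math
import random


def uniform_generator(n_nodes, rng):
    """Hypothetical stand-in for an evolved generator: uniform points in [0, 1]^2."""
    return [(rng.random(), rng.random()) for _ in range(n_nodes)]


def nearest_neighbour_tour(coords):
    """Placeholder labeller (greedy heuristic).

    For real training labels, replace this with Concorde (TSP) or HGS (CVRP).
    """
    unvisited = set(range(1, len(coords)))
    tour = [0]
    while unvisited:
        last = coords[tour[-1]]
        nxt = min(unvisited, key=lambda j: math.dist(last, coords[j]))
        unvisited.remove(nxt)
        tour.append(nxt)
    return tour


def build_dataset(n_instances, n_nodes, generator, labeller, seed=0):
    """Generate instances with `generator` and attach a tour from `labeller`."""
    rng = random.Random(seed)
    dataset = []
    for _ in range(n_instances):
        coords = generator(n_nodes, rng)
        dataset.append({"coords": coords, "tour": labeller(coords)})
    return dataset


if __name__ == "__main__":
    # Small numbers for a quick demonstration; the text above uses
    # tens of thousands of instances for actual fine-tuning.
    data = build_dataset(5, 20, uniform_generator, nearest_neighbour_tour)
    print(len(data), "instances, first tour starts with", data[0]["tour"][:5])
```

Keeping the generator and the labeller as separate callables makes it easy to swap in an evolved generator and a stronger solver without touching the loop.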
To fine-tune the pre-trained models in phase one, i.e., LEHD-EvoReal and POMO-EvoReal, first download the official checkpoints from the POMO and LEHD GitHub repositories:
For POMO: https://github.com/yd-kwon/POMO/tree/master/NEW_py_ver (checkpoint-2000.pt for TSP, and checkpoint-30500.pt for CVRP)
For LEHD: https://github.com/CIAM-Group/NCO_code/tree/main/single_objective/LEHD/TSP (checkpoint-150.pt for TSP, and checkpoint-40.pt for CVRP)
Once downloaded, run train_p1.py in each sub-folder for TSP and CVRP:
# For POMO
python POMO-EvoReal-2phase_finetune/TSP/POMO_TSP_p1/train_p1.py
python POMO-EvoReal-2phase_finetune/CVRP/POMO_CVRP_p1/train_p1.py
# For LEHD
python LEHD-EvoReal-2phase_finetune/TSP/LEHD_TSP_p1/TSP/train_p1.py
python LEHD-EvoReal-2phase_finetune/CVRP/LEHD_CVRP_p1/CVRP/train_p1.py
For phase two, there is no need to download checkpoints: the checkpoints saved from phase-one fine-tuning are in the folder ./result/p1-checkpoints of each problem. Run train_p2.py in each sub-folder for TSP and CVRP:
# For POMO
python POMO-EvoReal-2phase_finetune/TSP/POMO_TSP_p2/train_p2.py
python POMO-EvoReal-2phase_finetune/CVRP/POMO_CVRP_p2/train_p2.py
# For LEHD
python LEHD-EvoReal-2phase_finetune/TSP/LEHD_TSP_p2/TSP/train_p2.py
python LEHD-EvoReal-2phase_finetune/CVRP/LEHD_CVRP_p2/CVRP/train_p2.py
Modify parameters in the scripts as needed. The ablation studies in Table 3 and Table 4 can be reproduced by switching the corresponding training phase in train_xxx.py; no separate code is required beyond the provided training scripts. All checkpoints (trained with phase one and phase two), training logs, and statistics can be found in the /EXP_checkpoints folder.
To evaluate the trained models and reproduce the main table results, run test_tsplib.py and test_cvrplib.py in the TSP and CVRP sub-folders:
# For POMO
python POMO-EvoReal-2phase_finetune/TSP/test_tsplib.py
python POMO-EvoReal-2phase_finetune/CVRP/test_cvrplib.py
# For LEHD
python LEHD-EvoReal-2phase_finetune/TSP/test_tsplib.py
python LEHD-EvoReal-2phase_finetune/CVRP/test_cvrplib.py
All evaluation results and statistics will be saved in JSON format.
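When comparing evaluation output against the paper's tables, the usual metric on TSPLib and CVRPLib is the optimality gap relative to the best-known cost. The sketch below assumes that metric and is not taken from the repository: the function names are hypothetical, the schema of the saved JSON files is not specified here, and the numbers in the demo are made up purely to show the arithmetic.

```python
def optimality_gap(cost, best_known):
    """Gap in percent between a solver's tour cost and the best-known cost."""
    if best_known <= 0:
        raise ValueError("best_known must be positive")
    return 100.0 * (cost - best_known) / best_known


def mean_gap(results):
    """Average gap over a mapping of instance name -> (cost, best_known)."""
    if not results:
        raise ValueError("results is empty")
    gaps = [optimality_gap(cost, best) for cost, best in results.values()]
    return sum(gaps) / len(gaps)


if __name__ == "__main__":
    # Illustrative values only, not results from the paper.
    demo = {
        "instance_a": (105.0, 100.0),  # 5% above best known
        "instance_b": (101.0, 100.0),  # 1% above best known
    }
    print(f"mean gap: {mean_gap(demo):.2f}%")
```

To apply this to real output, load the JSON files with the standard `json` module and map whatever fields they contain onto the `(cost, best_known)` pairs expected here.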
EvoReal_main
TSP/
CVRP/
LEHD-EvoReal-2phase_finetune/
CVRP/
TSP/
POMO-EvoReal-2phase_finetune/
TSP/
CVRP/
EXP_checkpoints/
LEHD_CVRP_EvoReal_checkpoints/ # LEHD-CVRP model checkpoints and training logs
LEHD_TSP_EvoReal_checkpoints/ # LEHD-TSP model checkpoints and training logs
POMO_CVRP_EvoReal_checkpoints/ # POMO-CVRP model checkpoints and training logs
POMO_TSP_EvoReal_checkpoints/ # POMO-TSP model checkpoints and training logs
test_logs/ # All inference logs and statistics
requirements.txt
README.md
NOTICE
Important
One of the training datasets cannot be uploaded to GitHub due to the size limit. Please download it from: https://drive.google.com/drive/folders/1xlkZ_EmkC8YLE8OqSRmvQXSR13qxTz0P?usp=sharing and place the file in this folder: /LEHD-EvoReal/TSP/LEHD_TSP_p1/TSP/data
The code can only be used for non-commercial purposes. Please contact the authors if you want to use this code for business matters. If this repository is helpful for your research, please cite our paper:
@misc{zhu2025bridgingsyntheticrealrouting,
title={Bridging Synthetic and Real Routing Problems via LLM-Guided Instance Generation and Progressive Adaptation},
author={Jianghan Zhu and Yaoxin Wu and Zhuoyi Lin and Zhengyuan Zhang and Haiyan Yin and Zhiguang Cao and Senthilnath Jayavelu and Xiaoli Li},
year={2025},
eprint={2511.10233},
archivePrefix={arXiv},
primaryClass={cs.AI},
url={https://arxiv.org/abs/2511.10233},
}