We offer a suite of rerankers: pointwise models like monoT5, and listwise models with a focus on open-source LLMs compatible with FastChat (e.g., Vicuna, Zephyr), vLLM, or SGLang. We also support RankGPT variants, which are proprietary listwise rerankers. Some of the code in this repository is borrowed from RankGPT, PyGaggle, and LiT5!
current_version = 0.20.2
Note for Mac Users: RankLLM is not compatible with Apple Silicon (M1/M2) chips. However, you can still run it by using the Intel-based version of Anaconda and launching your terminal through Rosetta 2.
conda create -n rankllm python=3.10
conda activate rankllm
Install PyTorch with CUDA 12.1 wheels:
pip3 install torch torchvision torchaudio --index-url https://download.pytorch.org/whl/cu121
Or install the default PyTorch build from PyPI:
pip3 install torch torchvision torchaudio
conda install -c conda-forge openjdk=21 maven -y
pip install -r requirements.txt
If building nmslib fails during installation, try installing it manually with `conda install -c conda-forge nmslib`, then run `pip install -r requirements.txt` again.
pip install rank-llm[vllm]  # pip installation
pip install -e .[vllm]      # or local installation for development
pip install rank-llm[sglang]  # pip installation
pip install -e .[sglang]      # or local installation for development
Remember to install flashinfer to use the SGLang backend:
pip install flashinfer -i https://flashinfer.ai/whl/cu121/torch2.4/
We can run the RankZephyr model with the following command:
python src/rank_llm/scripts/run_rank_llm.py  --model_path=castorini/rank_zephyr_7b_v1_full --top_k_candidates=100 --dataset=dl20 \
--retrieval_method=SPLADE++_EnsembleDistil_ONNX --prompt_mode=rank_GPT  --context_size=4096 --variable_passages
Including the `--vllm_batched` flag runs the model in batched mode using the vLLM library.
Including the `--sglang_batched` flag runs the model in batched mode using the SGLang library.
To run multiple passes of the model, use the `--num_passes` flag.
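As a rough illustration of how these flags combine, the sketch below assembles the RankZephyr command above as an argument list. The helper name `build_rank_zephyr_command` is hypothetical and not part of RankLLM, and passing the number of passes as `--num_passes=N` with an integer value is an assumption; check the script's argument parser for the exact form.

```python
import shlex
from typing import List, Optional


def build_rank_zephyr_command(
    backend: Optional[str] = None,  # None, "vllm", or "sglang"
    num_passes: Optional[int] = None,
) -> List[str]:
    """Assemble the RankZephyr command shown above (hypothetical helper)."""
    cmd = [
        "python",
        "src/rank_llm/scripts/run_rank_llm.py",
        "--model_path=castorini/rank_zephyr_7b_v1_full",
        "--top_k_candidates=100",
        "--dataset=dl20",
        "--retrieval_method=SPLADE++_EnsembleDistil_ONNX",
        "--prompt_mode=rank_GPT",
        "--context_size=4096",
        "--variable_passages",
    ]
    # Add at most one batched-inference backend flag.
    if backend == "vllm":
        cmd.append("--vllm_batched")
    elif backend == "sglang":
        cmd.append("--sglang_batched")
    elif backend is not None:
        raise ValueError(f"unknown backend: {backend!r}")
    if num_passes is not None:
        # Assumption: the flag takes an integer value in this form.
        cmd.append(f"--num_passes={num_passes}")
    return cmd


if __name__ == "__main__":
    # Print a copy-pasteable shell command.
    print(shlex.join(build_rank_zephyr_command(backend="vllm", num_passes=2)))
```

The returned list can be passed to `subprocess.run(cmd, check=True)` from the repository root, which avoids shell-quoting issues with values such as `SPLADE++_EnsembleDistil_ONNX`.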
We can run the RankGPT4-o model with the following command:
python src/rank_llm/scripts/run_rank_llm.py  --model_path=gpt-4o --top_k_candidates=100 --dataset=dl20 \
  --retrieval_method=bm25 --prompt_mode=rank_GPT_APEER  --context_size=4096 --use_azure_openai
Note that `--prompt_mode` is set to `rank_GPT_APEER` to use the LLM-refined prompt from APEER.
This can be changed to `rank_GPT` to use the original prompt.
We can run the LiT5-Distill V2 model (which can rerank 100 documents in a single pass) with the following command:
python src/rank_llm/scripts/run_rank_llm.py  --model_path=castorini/LiT5-Distill-large-v2 --top_k_candidates=100 --dataset=dl19 \
    --retrieval_method=bm25 --prompt_mode=LiT5  --context_size=150 --vllm_batched --batch_size=4 \
    --variable_passages --window_size=100
We can run the original LiT5-Distill model (which works with a window size of 20) with the following command:
python src/rank_llm/scripts/run_rank_llm.py  --model_path=castorini/LiT5-Distill-large --top_k_candidates=100 --dataset=dl19 \
    --retrieval_method=bm25 --prompt_mode=LiT5  --context_size=150 --vllm_batched --batch_size=32 \
    --variable_passages
We can run the LiT5-Score model with the following command:
python src/rank_llm/scripts/run_rank_llm.py  --model_path=castorini/LiT5-Score-large --top_k_candidates=100 --dataset=dl19 \
    --retrieval_method=bm25 --prompt_mode=LiT5 --context_size=150 --vllm_batched --batch_size=8 \
    --window_size=100 --variable_passages
The following runs the 3B variant of monoT5 trained for 10K steps:
python src/rank_llm/scripts/run_rank_llm.py --model_path=castorini/monot5-3b-msmarco-10k --top_k_candidates=1000 --dataset=dl19 \
  --retrieval_method=bm25 --prompt_mode=monot5 --context_size=512
Note that we usually rerank 1K candidates with monoT5.
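To make the pointwise setting concrete: a pointwise reranker such as monoT5 scores each query-passage pair independently, and the candidates are then sorted by score, so reranking 1K candidates means 1K scoring calls. The sketch below shows only that control flow. `toy_score` is a stand-in term-overlap scorer invented for illustration, not monoT5, and none of these names come from RankLLM.

```python
from typing import Callable, List, Tuple


def toy_score(query: str, passage: str) -> float:
    """Stand-in scorer: fraction of query terms that appear in the passage."""
    query_terms = set(query.lower().split())
    passage_terms = set(passage.lower().split())
    if not query_terms:
        return 0.0
    return len(query_terms & passage_terms) / len(query_terms)


def pointwise_rerank(
    query: str,
    candidates: List[str],
    score_fn: Callable[[str, str], float] = toy_score,
) -> List[Tuple[str, float]]:
    """Score every candidate independently, then sort by descending score."""
    scored = [(passage, score_fn(query, passage)) for passage in candidates]
    # sorted() is stable, so ties keep their first-stage (e.g., BM25) order.
    return sorted(scored, key=lambda pair: pair[1], reverse=True)


if __name__ == "__main__":
    candidates = [
        "cats sleep most of the day",
        "neural rerankers reorder retrieved passages",
        "passages are retrieved with bm25",
    ]
    query = "how do rerankers reorder passages"
    for passage, score in pointwise_rerank(query, candidates):
        print(f"{score:.2f}  {passage}")
```

Because each score depends only on one query-passage pair, the scoring loop can be batched or parallelized freely, which is one reason pointwise models are practical on larger candidate lists.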
If you would like to contribute to the project, please refer to the contribution guidelines.
The vLLM and SGLang backends are only supported for the RankZephyr and RankVicuna models.
The following table lists the listwise models our repository was primarily built to handle (all hosted on Hugging Face):
| Model Name | Hugging Face Identifier/Link | 
|---|---|
| RankZephyr 7B V1 - Full - BF16 | castorini/rank_zephyr_7b_v1_full | 
| RankVicuna 7B - V1 | castorini/rank_vicuna_7b_v1 | 
| RankVicuna 7B - V1 - No Data Augmentation | castorini/rank_vicuna_7b_v1_noda | 
| RankVicuna 7B - V1 - FP16 | castorini/rank_vicuna_7b_v1_fp16 | 
| RankVicuna 7B - V1 - No Data Augmentation - FP16 | castorini/rank_vicuna_7b_v1_noda_fp16 | 
We also officially support the following rerankers built by our group:
The following table lists our LiT5 suite of models hosted on Hugging Face:
| Model Name | Hugging Face Identifier/Link | 
|---|---|
| LiT5 Distill base | castorini/LiT5-Distill-base | 
| LiT5 Distill large | castorini/LiT5-Distill-large | 
| LiT5 Distill xl | castorini/LiT5-Distill-xl | 
| LiT5 Distill base v2 | castorini/LiT5-Distill-base-v2 | 
| LiT5 Distill large v2 | castorini/LiT5-Distill-large-v2 | 
| LiT5 Distill xl v2 | castorini/LiT5-Distill-xl-v2 | 
| LiT5 Score base | castorini/LiT5-Score-base | 
| LiT5 Score large | castorini/LiT5-Score-large | 
| LiT5 Score xl | castorini/LiT5-Score-xl | 
Now you can run top-100 reranking with the v2 model in a single pass while maintaining efficiency!
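For context, listwise rerankers with a small window typically cover a longer candidate list with a sliding window that moves from the bottom of the list to the top, as in RankGPT; a window of 100 over 100 candidates reduces this to a single call. The sketch below illustrates the idea with a toy in-window ranker. The function names, the stride of 10, and the toy ranker are assumptions for illustration and are not RankLLM's implementation.

```python
from typing import Callable, List, Tuple

# A window ranker takes the items in one window and returns them reordered.
WindowRanker = Callable[[List[int]], List[int]]


def sliding_window_rerank(
    candidates: List[int],
    rank_window: WindowRanker,
    window_size: int = 20,
    stride: int = 10,
) -> Tuple[List[int], int]:
    """Rerank back to front; return the new order and the number of window calls."""
    if window_size <= 0 or stride <= 0:
        raise ValueError("window_size and stride must be positive")
    ranked = list(candidates)
    calls = 0
    end = len(ranked)
    while end > 0:
        start = max(0, end - window_size)
        # Reorder only the items inside the current window.
        ranked[start:end] = rank_window(ranked[start:end])
        calls += 1
        if start == 0:
            break
        # Move the window up; overlapping windows let strong items climb.
        end -= stride
    return ranked, calls


if __name__ == "__main__":
    # Toy setup: each candidate is its own relevance score, worst first.
    candidates = list(range(100))

    def toy_ranker(window: List[int]) -> List[int]:
        return sorted(window, reverse=True)

    _, calls_small = sliding_window_rerank(candidates, toy_ranker, window_size=20)
    _, calls_full = sliding_window_rerank(candidates, toy_ranker, window_size=100)
    print(calls_small, calls_full)
```

In this sketch, 100 candidates with a window of 20 and a stride of 10 take 9 window calls, while a window of 100 takes one. A single small-window pass only guarantees that the strongest items reach the top of the list; the rest of the list is not fully sorted.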
The following table lists our monoT5 suite of models hosted on Hugging Face:
| Model Name | Hugging Face Identifier/Link | 
|---|---|
| monoT5 Small MSMARCO 10K | castorini/monot5-small-msmarco-10k | 
| monoT5 Small MSMARCO 100K | castorini/monot5-small-msmarco-100k | 
| monoT5 Base MSMARCO | castorini/monot5-base-msmarco | 
| monoT5 Base MSMARCO 10K | castorini/monot5-base-msmarco-10k | 
| monoT5 Large MSMARCO 10K | castorini/monot5-large-msmarco-10k | 
| monoT5 Large MSMARCO | castorini/monot5-large-msmarco | 
| monoT5 3B MSMARCO 10K | castorini/monot5-3b-msmarco-10k | 
| monoT5 3B MSMARCO | castorini/monot5-3b-msmarco | 
| monoT5 Base Med MSMARCO | castorini/monot5-base-med-msmarco | 
| monoT5 3B Med MSMARCO | castorini/monot5-3b-med-msmarco | 
We recommend the Med models for biomedical retrieval. We also provide both 10K and 100K checkpoints: the 10K checkpoints are generally more effective out of domain (OOD), while the 100K checkpoints are more effective in domain.
If you use RankLLM, please cite the following relevant papers:
@ARTICLE{pradeep2023rankvicuna,
  title   = {{RankVicuna}: Zero-Shot Listwise Document Reranking with Open-Source Large Language Models},
  author  = {Ronak Pradeep and Sahel Sharifymoghaddam and Jimmy Lin},
  year    = {2023},
  journal = {arXiv:2309.15088}
}
[2312.02724] RankZephyr: Effective and Robust Zero-Shot Listwise Reranking is a Breeze!
@ARTICLE{pradeep2023rankzephyr,
  title   = {{RankZephyr}: Effective and Robust Zero-Shot Listwise Reranking is a Breeze!},
  author  = {Ronak Pradeep and Sahel Sharifymoghaddam and Jimmy Lin},
  year    = {2023},
  journal = {arXiv:2312.02724}
}
If you use one of the LiT5 models, please cite the following relevant paper:
@ARTICLE{tamber2023scaling,
  title   = {Scaling Down, LiTting Up: Efficient Zero-Shot Listwise Reranking with Seq2seq Encoder-Decoder Models},
  author  = {Manveer Singh Tamber and Ronak Pradeep and Jimmy Lin},
  year    = {2023},
  journal = {arXiv:2312.16098}
}
If you use one of the monoT5 models, please cite the following relevant paper:
@ARTICLE{pradeep2021emd,
  title = {The Expando-Mono-Duo Design Pattern for Text Ranking with Pretrained Sequence-to-Sequence Models},
  author = {Ronak Pradeep and Rodrigo Nogueira and Jimmy Lin},
  year = {2021},
  journal = {arXiv:2101.05667},
}
This research is supported in part by the Natural Sciences and Engineering Research Council (NSERC) of Canada.