Provide a mock instance to test vLLM without CUDA or any GPUs.
- GitHub repository: https://github.com/vkehfdl1/vllm-mock/
Supported mocks:

- `vllm.LLM.generate` mock
- `vllm.LLM.chat` mock
It is highly recommended to use the mock instance with pytest-mock.
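For readers unfamiliar with pytest-mock, its `mocker.patch` is a thin wrapper over the standard library's `unittest.mock.patch` that automatically undoes the patch when the test ends. A minimal sketch of the same substitution using only the standard library (the `FakeLLM` class and the namespace being patched are illustrative stand-ins, not part of vllm-mock):

```python
import types
from unittest.mock import patch


class FakeLLM:
    """Illustrative stand-in that returns canned text instead of running a model."""

    def __init__(self, model):
        self.model = model

    def generate(self, prompt, sampling_params=None):
        return [f"canned output for: {prompt}"]


# A stand-in "module" holding an LLM attribute, so the patch target is real.
fake_vllm = types.SimpleNamespace(LLM=object)

# Replace the attribute with the fake for the duration of the block only.
with patch.object(fake_vllm, "LLM", FakeLLM):
    llm = fake_vllm.LLM(model="mock-model")
    out = llm.generate("Hello, world!")

print(out[0])  # canned output for: Hello, world!
```

vllm-mock follows the same pattern, except the patched-in replacement mimics the full `vllm.LLM` output structure rather than returning bare strings.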
```python
from vllm_mock import LLM
from vllm import SamplingParams


def test_vllm(mocker):
    mock_class = mocker.patch("vllm.LLM")
    mock_class.return_value = LLM(model="mock-model")
    llm = mock_class()

    sampling_params = SamplingParams(temperature=0.8, top_p=0.9, logprobs=1)
    response = llm.generate("Hello, world!", sampling_params=sampling_params)
    assert isinstance(response[0].outputs[0].text, str)

    chat_response = llm.chat(
        [
            {"role": "system", "content": "You are a helpful assistant."},
            {"role": "user", "content": "Hello, world!"},
        ],
        sampling_params=sampling_params,
    )
    assert isinstance(chat_response[0].outputs[0].text, str)
```

Install with

```bash
pip install vllm-mock pytest-mock
```

or, in a UV environment,

```bash
uv add --dev vllm-mock pytest-mock
```

The following features are not yet supported:

- Mock vLLM API server
- Mock Reasoning model features
- Mock quantization features
- Mock LoRA features
- VLM models mock
- `vllm.LLM.beam_search` mock
- `vllm.LLM.embed` mock
- `vllm.LLM.classify` mock
- `vllm.LLM.encode` mock
- `vllm.LLM.reward` mock
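Until those mocks land, a missing method can be stubbed by hand with the standard library's `MagicMock`. A minimal sketch (the list-of-vectors return shape here is an assumption for illustration, not the real `vllm.LLM.embed` output, which wraps results in richer output objects):

```python
from unittest.mock import MagicMock

llm = MagicMock()
# Hypothetical canned embedding; swap in whatever shape your test expects.
llm.embed.return_value = [[0.0, 0.1, 0.2]]

vectors = llm.embed(["Hello, world!"])
print(vectors)  # [[0.0, 0.1, 0.2]]
```

`MagicMock` also records calls, so a test can additionally assert how the stubbed method was invoked (e.g. via `llm.embed.assert_called_once()`).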
First, clone the repository:

```bash
git clone https://github.com/NomaDamas/vllm-mock.git
cd vllm-mock
```

Then, install the environment and the pre-commit hooks with

```bash
make install
```

This will also generate your `uv.lock` file.

Initially, the CI/CD pipeline might fail due to formatting issues. To resolve them, run:

```bash
uv run pre-commit run -a
```

You can create any issue or PR to support this project. Thank you!
- Jeffrey is the creator of this repo. He made it because he desperately needed it for his research.
- NomaDamas is an AI open-source Hacker House in Seoul, Korea.
Repository initiated with fpgmaas/cookiecutter-uv.