I am using the RULER notebook to train a model, but I get this error:
```
---------------------------------------------------------------------------
RuntimeError                              Traceback (most recent call last)
/usr/local/lib/python3.12/dist-packages/unsloth_zoo/vllm_utils.py in load_vllm(model_name, config, gpu_memory_utilization, max_seq_length, dtype, training, float8_kv_cache, random_state, enable_lora, max_lora_rank, max_loras, use_async, use_engine, disable_log_stats, enforce_eager, enable_prefix_caching, compilation_config, conservativeness, max_logprobs, use_bitsandbytes, unsloth_vllm_standby, return_args)
   1499         if use_async:
-> 1500             llm = AsyncLLMEngine.from_engine_args(AsyncEngineArgs(**engine_args))
   1501         elif use_engine:

[... 31 frames hidden ...]

RuntimeError: torch.cuda.MemPool doesn't currently support expandable_segments.

During handling of the above exception, another exception occurred:

RuntimeError                              Traceback (most recent call last)
/usr/local/lib/python3.12/dist-packages/unsloth_zoo/vllm_utils.py in load_vllm(model_name, config, gpu_memory_utilization, max_seq_length, dtype, training, float8_kv_cache, random_state, enable_lora, max_lora_rank, max_loras, use_async, use_engine, disable_log_stats, enforce_eager, enable_prefix_caching, compilation_config, conservativeness, max_logprobs, use_bitsandbytes, unsloth_vllm_standby, return_args)
   1525                 )
   1526             else:
-> 1527                 raise RuntimeError(error)
   1528             pass
   1529             pass

RuntimeError: torch.cuda.MemPool doesn't currently support expandable_segments.
```
I have tried upgrading transformers, vllm, and ART, and I have also tried multiple models (including GPT-OSS 20B and Qwen/Qwen2.5-7B-Instruct), but nothing resolves the issue. Here is my notebook's code: https://colab.research.google.com/drive/13Ax7eQ313WxTHXzosUciHdBYXlnG9047?usp=sharing
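The error message suggests the CUDA caching allocator was configured with `expandable_segments`, which `torch.cuda.MemPool` (used by vLLM's sleep/standby mode) does not support. One possible workaround, assuming the allocator config is the trigger, is to override `PYTORCH_CUDA_ALLOC_CONF` before anything imports torch. This is a sketch, not a confirmed fix:

```python
import os

# Possible workaround (assumption, not a confirmed fix): PyTorch reads
# PYTORCH_CUDA_ALLOC_CONF once, when the CUDA caching allocator
# initializes, so this must run in the very first notebook cell,
# before torch, unsloth, vllm, or ART are imported.
os.environ["PYTORCH_CUDA_ALLOC_CONF"] = "expandable_segments:False"

# Only after setting the variable, import the training stack, e.g.:
# import torch
# import art
```

If the environment already exports `expandable_segments:True` (some Colab/unsloth setups do), restarting the runtime before running this cell may be necessary so no stale allocator state survives.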
StupidYoshiaki