Description
Summary
When training a LangGraph agent with `openpipe-art[backend,langgraph]`, the process fails at model initialization with the following error:
`RuntimeError: torch.cuda.MemPool doesn't currently support expandable_segments.`
The error occurs inside vLLM when allocating CUDA parameters during model initialization.
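A speculative workaround, not a confirmed fix: if `expandable_segments` is being enabled through the CUDA caching allocator config in this environment, forcing it off before torch is first imported may avoid the crash. This is an assumption on my part about the root cause; the sketch below only shows how the allocator option is set.

```python
# Speculative workaround sketch: force-disable expandable_segments in the
# CUDA caching allocator. This must run before torch is first imported,
# because torch reads PYTORCH_CUDA_ALLOC_CONF at CUDA initialization.
import os

os.environ["PYTORCH_CUDA_ALLOC_CONF"] = "expandable_segments:False"

import torch  # noqa: E402  (imported after setting the allocator config)
```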
Environment
- OS: Linux
- GPUs: 2x NVIDIA L4 (23 GB each)
- CUDA: 12.4 (`nvcc --version` shows Cuda compilation tools, release 12.4, V12.4.131)
- NVIDIA driver: 550.90.07
- Python: 3.12.x (venv with `uv`)
- Installed via: `pip install openpipe-art[backend,langgraph]`
- Dependency versions (from `uv.lock`):
  - torch==2.7.1
  - vllm==0.10.0
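For reference, the pins and CUDA build above can be confirmed from inside the venv with standard introspection (nothing below is specific to this bug):

```python
# Quick environment check from inside the venv.
import os

import torch
import vllm

print("torch:", torch.__version__)                 # expected 2.7.1
print("cuda (torch build):", torch.version.cuda)
print("vllm:", vllm.__version__)                   # expected 0.10.0
print("alloc conf:", os.environ.get("PYTORCH_CUDA_ALLOC_CONF"))
for i in range(torch.cuda.device_count()):
    print("gpu:", torch.cuda.get_device_name(i))   # expected NVIDIA L4 x2
```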
Steps to reproduce
- Create a new Python 3.12 virtual environment.
- `uv add openpipe-art[backend,langgraph]>=0.4.11`
- Run training (which calls `art.model.register()`); a minimal sketch follows this list.
- Observe the crash at model initialization.
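A minimal repro sketch. The `TrainableModel` / `LocalBackend` setup is my reconstruction from the ART README, and the name, project, and base model strings are hypothetical placeholders; the actual failing script only matters insofar as it reaches `model.register()`.

```python
# Minimal repro sketch (assumed API shape, not the exact failing script).
import asyncio

import art


async def main() -> None:
    backend = art.LocalBackend()
    model = art.TrainableModel(
        name="repro-agent",                       # hypothetical
        project="mempool-repro",                  # hypothetical
        base_model="Qwen/Qwen2.5-0.5B-Instruct",  # any small HF base model
    )
    # Crashes here during vLLM model initialization on this setup.
    await model.register(backend)


asyncio.run(main())
```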
Logs
File ".../vllm/model_executor/layers/vocab_parallel_embedding.py", line 34, in init
weight = Parameter(torch.empty(sum(output_partition_sizes), ...))
RuntimeError: torch.cuda.MemPool doesn't currently support expandable_segments.
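The failing call is a plain `torch.empty()` executed while vLLM routes allocations through a `torch.cuda.MemPool`. If that reading is right, the incompatibility should be reproducible without vLLM at all; this is a standalone sketch under the assumption that `expandable_segments` is enabled in the allocator config.

```python
# Standalone sketch of the suspected root cause: allocating inside a
# torch.cuda.MemPool while expandable_segments is enabled.
# Run with: PYTORCH_CUDA_ALLOC_CONF=expandable_segments:True
import torch

pool = torch.cuda.MemPool()
with torch.cuda.use_mem_pool(pool):
    # Expected to raise the same error as the vLLM traceback:
    # RuntimeError: torch.cuda.MemPool doesn't currently support expandable_segments.
    torch.empty(1024, device="cuda")
```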
Request
- Please confirm whether the currently pinned torch (2.7.1) + vllm (0.10.0) combination is expected to work with CUDA 12.4 / L4 GPUs.
- If not, could you provide a tested torch/vllm/xformers pin set for CUDA 12.4?
- Alternatively, handle this error in vLLM (or document the required versions) so users don't hit this blocker.
Happy to provide full logs (`pip freeze`, `nvcc`, etc.) if needed.