
Pull requests: vllm-project/vllm

FP8 KV cache append optimized with precomputed inverse scale
#31514 opened Dec 30, 2025 by tfpre

[Model] Enable LoRA support for tower and connector in LLaVA (labels: documentation)
#31513 opened Dec 30, 2025 by jayhemnani9910

[Hardware][Power] Add IBM MASS + NUMA optimizations for POWER8 (labels: cpu, documentation)
#31512 opened Dec 30, 2025 by Scottcjn

[Core] Deduplicate generate/encode logic in AsyncLLM (labels: frontend, ready, v1)
#31510 opened Dec 29, 2025 by njhill

[2/n] Migrate kernels to libtorch stable ABI (labels: ci/build, cpu, nvidia)
#31509 opened Dec 29, 2025 by mikaylagawarecki (Draft)

[Minor] Various small code cleanups/simplifications (labels: frontend, multi-modality, ready, structured-output, v1)
#31508 opened Dec 29, 2025 by njhill

Add embedding input functionality for disabled modalities (labels: multi-modality, tpu, v1)
#31506 opened Dec 29, 2025 by reaganjlee (Draft)

Migrate meetups & sponsors [2/N] (labels: documentation)
#31500 opened Dec 29, 2025 by esmeetu

[CI][NIXL] Split DPEP tests (labels: ci/build, kv-connector, ready, v1)
#31491 opened Dec 29, 2025 by NickLucche

[LoRA] Hide lora_init_id in request.py (labels: documentation, frontend, llama, performance, v1)
#31489 opened Dec 29, 2025 by jeejeelee (Draft)

[log] enable max_log_len trim only when needed (labels: frontend)
#31482 opened Dec 29, 2025 by andyxning

Add docker buildx bake configuration
#31477 opened Dec 29, 2025 by amrmahdi

FlashInferUnification (labels: nvidia)
#31476 opened Dec 29, 2025 by rajanyadav0307

[Model] Add HyperCLOVAX-SEED-Think-32B vision-language model support (labels: new-model)
#31471 opened Dec 29, 2025 by effortprogrammer (Draft)

feat: Add per-layer MLP size support for Qwen pruning (labels: documentation, qwen)
#31468 opened Dec 29, 2025 by CedricHwong

fixed mypy warnings for files under vllm/v1/attention (labels: nvidia, rocm, v1)
#31465 opened Dec 29, 2025 by MrIceCreamMan

Add Cogagent Model to vllm (labels: documentation, new-model, v1)
#31463 opened Dec 29, 2025 by JBurtn (Draft)