WoosukKwon


Easy, Fast, and Scalable Multimodal AI

Python 31 4 Updated Nov 14, 2025

verl: Volcano Engine Reinforcement Learning for LLMs

Python 15,650 2,526 Updated Nov 14, 2025

TPU inference for vLLM, with unified JAX and PyTorch support.

Python 161 31 Updated Nov 14, 2025

SkyRL: A Modular Full-stack RL Library for LLMs

Python 1,197 167 Updated Nov 13, 2025

Post-training with Tinker

Python 1,940 151 Updated Nov 14, 2025

[NeurIPS 2025] Scaling Speculative Decoding with Lookahead Reasoning

Python 51 6 Updated Oct 31, 2025

Common recipes to run vLLM

Jupyter Notebook 225 78 Updated Nov 13, 2025

Open-source implementation of AlphaEvolve

Python 4,535 669 Updated Nov 12, 2025

Nano vLLM

Python 8,903 1,076 Updated Nov 3, 2025

Achieve state of the art inference performance with modern accelerators on Kubernetes

Shell 2,041 232 Updated Nov 13, 2025

A Datacenter Scale Distributed Inference Serving Framework

Rust 5,473 690 Updated Nov 14, 2025

ArcticInference: vLLM plugin for high-throughput, low-latency inference

Python 300 37 Updated Nov 11, 2025

MiMo: Unlocking the Reasoning Potential of Language Model – From Pretraining to Posttraining

Python 1,621 68 Updated Jun 5, 2025

Democratizing Reinforcement Learning for LLMs

Jupyter Notebook 4,721 444 Updated Nov 14, 2025

[ACL 2025 Long Main] Language Model Fine-Tuning on Scaled Survey Data for Predicting Distributions of Public Opinions

Python 36 9 Updated Apr 21, 2025

NumPy aware dynamic Python compiler using LLVM

Python 10,725 1,205 Updated Nov 14, 2025

[NeurIPS 2025] Simple extension on vLLM to help you speed up reasoning model without training.

Python 205 27 Updated May 31, 2025

A collection of GPT system prompts and various prompt injection/leaking knowledge.

HTML 9,856 1,381 Updated Oct 24, 2025

FAIR Sequence Modeling Toolkit 2

Python 1,074 124 Updated Nov 14, 2025

Mooncake is the serving platform for Kimi, a leading LLM service provided by Moonshot AI.

C++ 4,265 428 Updated Nov 14, 2025

Fast, Flexible and Portable Structured Generation

C++ 1,375 99 Updated Nov 9, 2025

Helpful tools and examples for working with flex-attention

Python 1,053 64 Updated Nov 13, 2025

Efficient Triton Kernels for LLM Training

Python 5,836 431 Updated Nov 11, 2025

A throughput-oriented high-performance serving framework for LLMs

Jupyter Notebook 914 44 Updated Oct 29, 2025

Enabling PyTorch on XLA Devices (e.g. Google TPU)

C++ 2,702 560 Updated Nov 14, 2025

A minimal implementation of vllm.

Cuda 60 1 Updated Jul 27, 2024

A high-throughput and memory-efficient inference and serving engine for LLMs

Python 108 50 Updated Nov 14, 2025

PyTorch native quantization and sparsity for training and inference

Python 2,504 369 Updated Nov 14, 2025