
Sponsors

@Ying1123
@HaiShaw
@amiruci

Highlights

  • Pro

Organizations

@apache @dmlc @ucbrise @alpa-projects @lm-sys

Python 317 23 Updated Nov 28, 2025

JAX backend for SGL

Python 185 36 Updated Nov 27, 2025

GLM-4.5: Agentic, Reasoning, and Coding (ARC) Foundation Models

Python 3,250 333 Updated Oct 11, 2025

Kimi K2 is the large language model series developed by Moonshot AI team

9,609 682 Updated Nov 7, 2025

OME is a Kubernetes operator for enterprise-grade management and serving of Large Language Models (LLMs)

Go 322 46 Updated Nov 28, 2025

Genai-bench is a powerful benchmark tool designed for comprehensive token-level performance evaluation of large language model (LLM) serving systems.

Python 234 35 Updated Nov 25, 2025

slime is an LLM post-training framework for RL Scaling.

Python 2,602 289 Updated Nov 28, 2025
Python 918 96 Updated Nov 22, 2025

Mooncake is the serving platform for Kimi, a leading LLM service provided by Moonshot AI.

C++ 4,331 445 Updated Nov 27, 2025

Lightning-Fast RL for LLM Reasoning and Agents. Made Simple & Flexible.

Python 3,090 238 Updated Nov 28, 2025

Expander, an open-source GKR prover designed for scaling large-scale parallel computing.

Rust 136 53 Updated Sep 18, 2025

Unified Efficient Fine-Tuning of 100+ LLMs & VLMs (ACL 2024)

Python 63,196 7,642 Updated Nov 27, 2025

The source of LMSYS website and blogs

JavaScript 70 56 Updated Nov 26, 2025

verl: Volcano Engine Reinforcement Learning for LLMs

Python 16,767 2,667 Updated Nov 27, 2025

Fast low-bit matmul kernels in Triton

Python 401 29 Updated Nov 21, 2025

HunyuanVideo: A Systematic Framework For Large Video Generation Model

Python 11,339 1,138 Updated Nov 21, 2025

MiniCPM4 & MiniCPM4.1: Ultra-Efficient LLMs on End Devices, achieving a 3x+ generation speedup on reasoning tasks

Jupyter Notebook 8,436 522 Updated Oct 8, 2025

[ICLR2025 Spotlight] SVDQuant: Absorbing Outliers by Low-Rank Components for 4-Bit Diffusion Models

Python 3,390 201 Updated Nov 17, 2025

Model Compression Toolbox for Large Language Models and Diffusion Models

Python 700 69 Updated Aug 14, 2025

Fast, Flexible and Portable Structured Generation

C++ 1,395 103 Updated Nov 27, 2025

My learning notes/codes for ML SYS.

Python 4,289 259 Updated Nov 25, 2025

Documentation on how to establish a 501(c)(3) nonprofit organization in California, USA

14 2 Updated Sep 12, 2021

An Easy-to-use, Scalable and High-performance RLHF Framework based on Ray (PPO & GRPO & REINFORCE++ & vLLM & Ray & Dynamic Sampling & Async Agentic RL)

Python 8,474 820 Updated Nov 9, 2025

[ICLR2025, ICML2025, NeurIPS2025 Spotlight] Quantized Attention achieves speedup of 2-5x compared to FlashAttention, without losing end-to-end metrics across language, image, and video models.

Cuda 2,731 272 Updated Nov 6, 2025

SGLang is a fast serving framework for large language models and vision language models.

Python 30 19 Updated Nov 24, 2025

Materials for learning SGLang

656 47 Updated Nov 21, 2025

SOTA Open Source TTS

Python 24,201 1,974 Updated Nov 6, 2025

Efficient Triton Kernels for LLM Training

Python 5,884 438 Updated Nov 28, 2025