- University of Virginia
- Charlottesville, VA
- http://tddg.github.io
- https://ds2-lab.github.io/
- @yuecheng87
- in/yue-cheng
Starred repositories
A Claude Code plugin that automatically captures everything Claude does during your coding sessions, compresses it with AI (using Claude's agent-sdk), and injects relevant context back into future …
A compact implementation of SGLang, designed to demystify the complexities of modern LLM serving systems.
Create Epic Math and Physics Animations & Study Notes From Text and Images.
💫 Toolkit to help you get started with Spec-Driven Development
Open-source implementation of AlphaEvolve
Mooncake is the serving platform for Kimi, a leading LLM service provided by Moonshot AI.
A Comprehensive Benchmark to Evaluate LLMs as Agents (ICLR'24)
[ICLR'24 spotlight] An open platform for training, serving, and evaluating large language models for tool learning.
Heterogeneous AI Computing Virtualization Middleware (Project under CNCF)
What if we could pack single purpose, powerful AI Agents into a single python file?
DeepSpeed is a deep learning optimization library that makes distributed training and inference easy, efficient, and effective.
Inspect a command's effects before modifying your live system
21 Lessons, Get Started Building with Generative AI
A tiny yet powerful LLM inference system tailored for research purposes. vLLM-equivalent performance with only 2k lines of code (2% of vLLM).
Scalene: a high-performance, high-precision CPU, GPU, and memory profiler for Python with AI-powered optimization proposals
A command-line productivity tool, powered by AI large language models like GPT-4, that helps you accomplish your tasks faster and more efficiently.
Envision a future where every student can read all the code of a teaching operating system.
Sparsity-aware deep learning inference runtime for CPUs
LLM Serving Performance Evaluation Harness
[MLSys 2024 Best Paper Award] AWQ: Activation-aware Weight Quantization for LLM Compression and Acceleration
Implement a ChatGPT-like LLM in PyTorch from scratch, step by step
Efficient and easy multi-instance LLM serving
Awesome LLM compression research papers and tools.
ALPS: An Adaptive Learning, Priority OS Scheduler for Serverless Functions (USENIX ATC'24)
Algorithmic complexity attacks for dynamic learned indexes (VLDB'24)