High-quality single file implementation of Deep Reinforcement Learning algorithms with research-friendly features (PPO, DQN, C51, DDPG, TD3, SAC, PPG)

Python 8,780 955 Updated Jul 8, 2025

A compact implementation of SGLang, designed to demystify the complexities of modern LLM serving systems.

Python 2,869 311 Updated Jan 6, 2026

Minimal reproduction of OneRec

Python 840 120 Updated Jan 4, 2026

BACCS Cinema Website (building)

CSS 2 2 Updated Dec 8, 2025

Fast and memory-efficient exact attention

Python 21,526 2,271 Updated Jan 10, 2026

Helpful tools and examples for working with flex-attention

Python 1,107 70 Updated Jan 8, 2026

The best ChatGPT that $100 can buy.

Python 40,034 5,131 Updated Jan 8, 2026

A digital version of the board game Deep Future

Lua 167 12 Updated Jan 9, 2026

Domain-specific language designed to streamline the development of high-performance GPU/CPU/accelerator kernels

Python 4,556 384 Updated Jan 9, 2026

FlexAttention-based, minimal vLLM-style inference engine for fast Gemma 2 inference.

Python 328 19 Updated Nov 2, 2025

Nano vLLM

Python 10,671 1,358 Updated Nov 3, 2025

gpt-oss-120b and gpt-oss-20b are two open-weight language models by OpenAI

Python 19,568 2,014 Updated Nov 1, 2025

The official implementation of Autoregressive Image Generation using Residual Quantization (CVPR '22)

Jupyter Notebook 988 109 Updated Jan 3, 2024

Complete solutions to Programming Massively Parallel Processors, 4th Edition

Jupyter Notebook 633 83 Updated Jun 18, 2025

AlphaFold 3 inference pipeline.

Python 7,435 1,065 Updated Jan 9, 2026

Advanced NLP, Spring 2025 https://cmu-l3.github.io/anlp-spring2025/

Jupyter Notebook 70 21 Updated Mar 28, 2025

Home for "How To Scale Your Model", a short blog-style textbook about scaling LLMs on TPUs

HTML 803 116 Updated Jan 10, 2026

Repository of Jupyter notebook tutorials for teaching the Deep Learning Course at the University of Amsterdam (MSc AI), Fall 2023

Jupyter Notebook 3,063 664 Updated Oct 31, 2025

A Quirky Assortment of CuTe Kernels

Python 743 70 Updated Jan 7, 2026

My learning notes for ML SYS.

Python 5,000 325 Updated Jan 8, 2026

🚀 Awesome System for Machine Learning ⚡️ AI System Papers and Industry Practice. ⚡️ System for Machine Learning, LLM (Large Language Model), GenAI (Generative AI). 🍻 OSDI, NSDI, SIGCOMM, SoCC, MLSy…

3,553 360 Updated Jul 25, 2025

Repository hosting code for "Actions Speak Louder than Words: Trillion-Parameter Sequential Transducers for Generative Recommendations" (https://arxiv.org/abs/2402.17152).

Python 1,660 330 Updated Jan 10, 2026

🚀 Efficient implementations of state-of-the-art linear attention models

Python 4,211 349 Updated Jan 10, 2026

Analyze computation-communication overlap in V3/R1.

1,132 144 Updated Mar 21, 2025

Expert Parallelism Load Balancer

Python 1,330 196 Updated Mar 24, 2025

A bidirectional pipeline parallelism algorithm for computation-communication overlap in DeepSeek V3/R1 training.

Python 2,901 312 Updated Mar 10, 2025

DeepGEMM: clean and efficient FP8 GEMM kernels with fine-grained scaling

Cuda 6,046 792 Updated Jan 6, 2026

DeepEP: an efficient expert-parallel communication library

Cuda 8,874 1,056 Updated Dec 29, 2025

FlashMLA: Efficient Multi-head Latent Attention Kernels

C++ 11,961 926 Updated Dec 15, 2025

Production-tested AI infrastructure tools for efficient AGI development and community-driven innovation

7,948 288 Updated May 15, 2025