

Starred repositories


Material for gpu-mode lectures

Jupyter Notebook 5,355 538 Updated Nov 21, 2025

Open Source DeepWiki: AI-Powered Wiki Generator for GitHub/Gitlab/Bitbucket Repositories. Join the discord: https://discord.gg/gMwThUMeme

Python 12,410 1,345 Updated Nov 25, 2025

Large Language Model Text Generation Inference

Python 10,674 1,243 Updated Nov 19, 2025

Transformers-compatible library for applying various compression algorithms to LLMs for optimized deployment with vLLM

Python 2,283 294 Updated Nov 27, 2025

UCCL is an efficient communication library for GPUs, covering collectives, P2P (e.g., KV cache transfer, RL weight transfer), and EP (e.g., GPU-driven)

C++ 1,100 102 Updated Nov 28, 2025

[ICLR 2025 Oral] Block Diffusion: Interpolating Between Autoregressive and Diffusion Language Models

Python 897 50 Updated Jul 10, 2025

Fast and memory-efficient exact attention

Python 20,787 2,170 Updated Nov 25, 2025

A framework for few-shot evaluation of language models.

Python 10,775 2,879 Updated Nov 27, 2025

AWM: Agent Workflow Memory

Python 359 30 Updated Jan 31, 2025

Code and Data for Tau-Bench

Python 969 156 Updated Aug 28, 2025

🌐 Jekyll is a blog-aware static site generator in Ruby

Ruby 51,139 10,246 Updated Nov 7, 2025

A lightweight data processing framework built on DuckDB and 3FS.

Python 4,847 431 Updated Mar 5, 2025

Speed Always Wins: A Survey on Efficient Architectures for Large Language Models

364 31 Updated Nov 11, 2025

Constrained Decoding of Diffusion LLMs with Context-Free Grammars.

Rust 31 3 Updated Nov 12, 2025

Heterogeneous AI Computing Virtualization Middleware (Project under CNCF)

Go 2,685 427 Updated Nov 27, 2025

Lightning-Fast RL for LLM Reasoning and Agents. Made Simple & Flexible.

Python 3,090 238 Updated Nov 28, 2025

A curated list of papers related to constrained decoding of LLMs, along with their relevant code and resources.

294 13 Updated Oct 15, 2025

The most open diffusion language model for code generation — releasing pretraining, evaluation, inference, and checkpoints.

Python 464 32 Updated Nov 11, 2025

Official PyTorch implementation of the paper "dLLM-Cache: Accelerating Diffusion Large Language Models with Adaptive Caching" (dLLM-Cache).

Python 185 14 Updated Nov 17, 2025

Official implementation of "DPad: Efficient Diffusion Language Models with Suffix Dropout"

Python 52 5 Updated Nov 22, 2025

[NeurIPS'25] dKV-Cache: The Cache for Diffusion Language Models

Python 120 9 Updated May 22, 2025

A collection of papers on discrete diffusion models

166 2 Updated Jun 30, 2025

xDiT: A Scalable Inference Engine for Diffusion Transformers (DiTs) with Massive Parallelism

Python 2,435 291 Updated Nov 28, 2025

📚A curated list of Awesome Diffusion Inference Papers with Codes: Sampling, Cache, Quantization, Parallelism, etc.🎉

Python 453 24 Updated Nov 28, 2025

📚A curated list of Awesome LLM/VLM Inference Papers with Codes: Flash-Attention, Paged-Attention, WINT8/4, Parallelism, etc.🎉

Python 4,760 324 Updated Nov 28, 2025

Official PyTorch implementation for ICLR2025 paper "Scaling up Masked Diffusion Models on Text"

Python 343 24 Updated Dec 22, 2024

Official PyTorch implementation for "Large Language Diffusion Models"

Python 3,321 223 Updated Nov 12, 2025

⚡FlashRAG: A Python Toolkit for Efficient RAG Research (WWW2025 Resource)

Python 3,179 271 Updated Nov 26, 2025

LLM training in simple, raw C/CUDA

Cuda 28,267 3,298 Updated Jun 26, 2025

A library for efficient similarity search and clustering of dense vectors.

C++ 38,180 4,137 Updated Nov 24, 2025