FlashMLABase Public
Forked from deepseek-ai/FlashMLA
FlashMLA: Efficient MLA decoding kernels
C++ MIT License Updated Sep 24, 2025
Manifesto-against-the-Plagiarist-Yunhe-Wang Public
Forked from knemik97/Manifesto-against-the-Plagiarist-Yunhe-Wang
Proclamation against the thief Wang Yunhe
Apache License 2.0 Updated Jul 8, 2025
LeetCUDA Public
Forked from xlite-dev/LeetCUDA
📚LeetCUDA: Modern CUDA Learn Notes with PyTorch for Beginners🐑, 200+ CUDA/Tensor Cores Kernels, HGEMM, FA-2 MMA.🎉
Cuda GNU General Public License v3.0 Updated Jun 29, 2025
minimind Public
Forked from jingyaogong/minimind
🚀🚀 Train a small 26M-parameter GPT entirely from scratch in just 2 hours!
Python Apache License 2.0 Updated Apr 30, 2025
sglang Public
Forked from sgl-project/sglang
SGLang is yet another fast serving framework for large language models and vision language models.
Python Apache License 2.0 Updated Apr 29, 2025
cute-flash-attention Public
Forked from luliyucoordinate/cute-flash-attention
Implement Flash Attention using Cute.
Cuda Updated Dec 17, 2024
fast.cu Public
Forked from pranjalssh/fast.cu
Fastest kernels written from scratch
Cuda MIT License Updated Nov 30, 2024
lectures Public
Forked from gpu-mode/lectures
Material for gpu-mode lectures
Jupyter Notebook Apache License 2.0 Updated Nov 23, 2024
applied-ai Public
Forked from meta-pytorch/applied-ai
Applied AI experiments and examples for PyTorch
Python BSD 3-Clause "New" or "Revised" License Updated Nov 15, 2024
how-to-optim-algorithm-in-cuda Public
Forked from BBuf/how-to-optim-algorithm-in-cuda
How to optimize some algorithms in CUDA.
Cuda Updated Nov 12, 2024
FlagGems Public
Forked from flagos-ai/FlagGems
FlagGems is an operator library for large language models implemented in Triton Language.
Python Apache License 2.0 Updated Sep 9, 2024
transformers Public
Forked from huggingface/transformers
🤗 Transformers: State-of-the-art Machine Learning for PyTorch, TensorFlow, and JAX.
Python Apache License 2.0 Updated Aug 21, 2024
llm_interview_note Public
Forked from wdndev/llm_interview_note
Notes on knowledge and interview questions for large language model (LLM) algorithm (application) engineers
HTML Updated Aug 19, 2024
marlin Public
Forked from IST-DASLab/marlin
FP16xINT4 LLM inference kernel that can achieve near-ideal ~4x speedups up to medium batchsizes of 16-32 tokens.
Python Apache License 2.0 Updated Aug 15, 2024
triton Public
Forked from triton-lang/triton
Development repository for the Triton language and compiler
C++ MIT License Updated Aug 7, 2024
flash-attention Public
Forked from Dao-AILab/flash-attention
Fast and memory-efficient exact attention
Python BSD 3-Clause "New" or "Revised" License Updated Jul 11, 2024
Awesome-LLM-Inference Public
Forked from xlite-dev/Awesome-LLM-Inference
📖A curated list of Awesome LLM Inference Paper with codes, TensorRT-LLM, vLLM, streaming-llm, AWQ, SmoothQuant, WINT8/4, Continuous Batching, FlashAttention, PagedAttention etc.
GNU General Public License v3.0 Updated Jul 3, 2024
Mooncake Public
Forked from kvcache-ai/Mooncake
Mooncake is the serving platform for Kimi, a leading LLM service provided by Moonshot AI.
Updated Jul 2, 2024
gpu-benches Public
Forked from RRZE-HPC/gpu-benches
Collection of benchmarks to measure basic GPU capabilities
Jupyter Notebook GNU General Public License v3.0 Updated Jun 21, 2024
cutlass_cute_experiments Public
Forked from HydraQYH/cutlass_cute_experiments
Cuda Updated Jun 18, 2024
deepseekv2-profile Public
Forked from madsys-dev/deepseekv2-profile
Jupyter Notebook Updated May 31, 2024
Llama-Chinese Public
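Forked from LlamaFamily/Llama-Chinese
Llama Chinese community: the Llama3 online demo and fine-tuned models are now available, the latest Llama3 learning resources are collected in real time, and all code has been updated for Llama3. Building the best Chinese Llama large model, fully open source and available for commercial use.
Python Updated Apr 24, 2024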
unsloth Public
Forked from unslothai/unsloth
Finetune Llama 3, Mistral & Gemma LLMs 2-5x faster with 80% less memory
Python Apache License 2.0 Updated Apr 22, 2024
llama-recipes Public
Forked from meta-llama/llama-cookbook
Scripts for fine-tuning Llama2 with composable FSDP & PEFT methods to cover single/multi-node GPUs. Supports default & custom datasets for applications such as summarization & question answering. S…
Jupyter Notebook Other Updated Apr 13, 2024