-
gemlite Public
Forked from dropbox/gemlite
Fast low-bit matmul kernels in Triton
Python Apache License 2.0 Updated Oct 26, 2025 -
agents-towards-production Public
Forked from NirDiamant/agents-towards-production
This repository delivers end-to-end, code-first tutorials covering every layer of production-grade GenAI agents, guiding you from spark to scale with proven patterns and reusable blueprints for re…
Jupyter Notebook Other Updated Oct 26, 2025 -
LLMs-from-scratch Public
Forked from rasbt/LLMs-from-scratch
Implementing a ChatGPT-like LLM in PyTorch from scratch, step by step
Jupyter Notebook Other Updated Oct 13, 2025 -
ai-engineering-hub Public
Forked from patchy631/ai-engineering-hub
In-depth tutorials on LLMs, RAGs and real-world AI agent applications.
Jupyter Notebook MIT License Updated Oct 8, 2025 -
Embodied-AI-Guide Public
Forked from TianxingChen/Embodied-AI-Guide
[Lumina Embodied AI] Embodied AI Technical Guide (Embodied-AI-Guide)
Other Updated Sep 22, 2025 -
ml-engineering Public
Forked from stas00/ml-engineering
Machine Learning Engineering Open Book
Python Creative Commons Attribution Share Alike 4.0 International Updated Sep 2, 2025 -
AI-Infra-from-Zero-to-Hero Public
Forked from HuaizhengZhang/AI-Infra-from-Zero-to-Hero
🚀 Awesome System for Machine Learning ⚡️ AI System Papers and Industry Practice. ⚡️ System for Machine Learning, LLM (Large Language Model), GenAI (Generative AI). 🍻 OSDI, NSDI, SIGCOMM, SoCC, MLSy…
MIT License Updated Jul 25, 2025 -
ai-infra-hpc Public
Forked from jinbooooom/ai-infra-hpc
HPC tutorials covering collective communication (MPI, NCCL), CUDA programming, SIMD vectorization, RDMA communication, and more
Cuda MIT License Updated Jul 24, 2025 -
AISystem Public
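Forked from Infrasys-AI/AISystem
AISystem mainly refers to AI systems, covering full-stack low-level AI technologies such as AI chips, AI compilers, and AI inference and training frameworks
Jupyter Notebook Apache License 2.0 Updated Jul 6, 2025 -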
trt-samples-for-hackathon-cn Public
Forked from NVIDIA/trt-samples-for-hackathon-cn
Simple samples for TensorRT programming
Python Apache License 2.0 Updated May 28, 2025 -
LLM-engineer-handbook Public
Forked from SylphAI-Inc/LLM-engineer-handbook
A curated list of Large Language Model resources, covering model training, serving, fine-tuning, and building LLM applications.
MIT License Updated May 26, 2025 -
cuda-python Public
Forked from NVIDIA/cuda-python
CUDA Python: Performance meets Productivity
Python Other Updated Apr 16, 2025 -
accelerated-computing-hub Public
Forked from NVIDIA/accelerated-computing-hub
NVIDIA curated collection of educational resources related to general purpose GPU programming.
Jupyter Notebook Other Updated Apr 9, 2025 -
scikit-build-core Public
Forked from scikit-build/scikit-build-core
A next generation Python CMake adaptor and Python API for plugins
Python Apache License 2.0 Updated Apr 9, 2025 -
AITemplate Public
Forked from facebookincubator/AITemplate
AITemplate is a Python framework which renders neural network into high performance CUDA/HIP C++ code. Specialized for FP16 TensorCore (NVIDIA GPU) and MatrixCore (AMD GPU) inference.
Python Apache License 2.0 Updated Apr 1, 2025 -
dynamo Public
Forked from ai-dynamo/dynamo
A Datacenter Scale Distributed Inference Serving Framework
Rust Apache License 2.0 Updated Mar 25, 2025 -
ragflow Public
Forked from infiniflow/ragflow
RAGFlow is an open-source RAG (Retrieval-Augmented Generation) engine based on deep document understanding.
TypeScript Apache License 2.0 Updated Mar 18, 2025 -
lectures Public
Forked from gpu-mode/lectures
Material for gpu-mode lectures
Jupyter Notebook Apache License 2.0 Updated Mar 17, 2025 -
triton_docs_tutorials Public
Forked from evintunador/triton_docs_tutorials
making the official triton tutorials actually comprehensible
Python MIT License Updated Mar 17, 2025 -
100-daysofcuda Public
Forked from prateekshukla1108/100-daysofcuda
Kernels Written for 100 days of CUDA Challenge
Cuda MIT License Updated Mar 15, 2025 -
llm_engineering Public
Forked from ed-donner/llm_engineering
Repo to accompany my mastering LLM engineering course
Jupyter Notebook MIT License Updated Mar 15, 2025 -
QAnything Public
Forked from netease-youdao/QAnything
Question and Answer based on Anything.
Python GNU Affero General Public License v3.0 Updated Mar 12, 2025 -
CutlassAcademy Public
Forked from MekkCyber/CutlassAcademy
A curated collection of resources, tutorials, and best practices for learning and mastering NVIDIA CUTLASS
Updated Mar 11, 2025 -
xDiT Public
Forked from xdit-project/xDiT
xDiT: A Scalable Inference Engine for Diffusion Transformers (DiTs) with Massive Parallelism
Python Apache License 2.0 Updated Mar 11, 2025 -
AIInfra Public
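Forked from Infrasys-AI/AIInfra
AIInfra (AI infrastructure) refers to the AI system stack, from low-level hardware such as chips up to the upper software stack, that supports training and inference of large AI models.
Python Apache License 2.0 Updated Mar 9, 2025 -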
how-to-optim-algorithm-in-cuda Public
Forked from BBuf/how-to-optim-algorithm-in-cuda
how to optimize some algorithm in cuda.
Cuda Updated Feb 23, 2025