Starred repositories
deepspeedai / Megatron-DeepSpeed
Forked from NVIDIA/Megatron-LM
Ongoing research training transformer language models at scale, including: BERT & GPT-2
Optimized primitives for collective multi-GPU communication
"rsync for cloud storage" - Google Drive, S3, Dropbox, Backblaze B2, OneDrive, Swift, Hubic, Wasabi, Google Cloud Storage, Azure Blob, Azure Files, Yandex Files
Notes mainly covering knowledge and interview questions for large language model (LLM) algorithm (application) engineers
Machine Learning Engineering Open Book
AuctionGym is a simulation environment that enables reproducible evaluation of bandit and reinforcement learning methods for online advertising auctions.
A universal scalable machine learning model deployment solution
Composable transformations of Python+NumPy programs: differentiate, vectorize, JIT to GPU/TPU, and more
A collection of LogitsProcessors to customize and enhance LLM behavior for specific tasks.
The Finch CLI is an open source client for container development
An intuitive, lightweight web framework in C for building modern web applications
An API standard for single-agent reinforcement learning environments, with popular reference environments and related utilities (formerly Gym)
Define Kubernetes native apps and abstractions using object-oriented programming
Fast, Flexible and Portable Structured Generation
📚LeetCUDA: Modern CUDA Learn Notes with PyTorch for Beginners🐑, 200+ CUDA Kernels, Tensor Cores, HGEMM, FA-2 MMA.🎉
Triton backend that enables pre-process, post-processing and other logic to be implemented in Python.
DeepGEMM: clean and efficient FP8 GEMM kernels with fine-grained scaling
Kubernetes Operator for MPI-based applications (distributed training, HPC, etc.)
Work with remote image registries - retrieving information, images, signing content
Universal LLM Deployment Engine with ML Compilation
VILA is a family of state-of-the-art vision language models (VLMs) for diverse multimodal AI tasks across the edge, data center, and cloud.