- NVIDIA
- Huntsville, AL (UTC -06:00)
- in/matt-nicely
Stars
FlashInfer: Kernel Library for LLM Serving
NCCL communication API layer, and transport layer created from first principles.
Optimized primitives for collective multi-GPU communication
cudnn_frontend provides a C++ wrapper for the cuDNN backend API and samples on how to use it
The nvImageCodec library of GPU- and CPU-accelerated codecs featuring a unified interface
TRaSH-Guides is a comprehensive collection of guides for Radarr, Sonarr, and related media management applications.
A list of awesome compiler projects and papers for tensor computation and deep learning.
The official PyTorch implementation of the paper "Human Motion Diffusion Model"
CUDA Templates and Python DSLs for High-Performance Linear Algebra
[ARCHIVED] Cooperative primitives for CUDA C++. See https://github.com/NVIDIA/cccl