Matthew Nicely mnicely

Group Product Manager - Kernels & Comms [email protected]

60 followers · 5 following

NVIDIA
Huntsville, AL
19:14 (UTC -06:00)
in/matt-nicely

Stars

flashinfer-ai / flashinfer

FlashInfer: Kernel Library for LLM Serving

Cuda · 4,056 stars · 564 forks · Updated Nov 13, 2025

vipulSharma18 / NCCL-From-First-Principles

NCCL communication API layer, and transport layer created from first principles.

C++ · 13 stars · Updated Aug 20, 2025

NVIDIA / nccl-tests

NCCL Tests

Cuda · 1,331 stars · 329 forks · Updated Nov 3, 2025

Dao-AILab / quack

A Quirky Assortment of CuTe Kernels

Python · 653 stars · 60 forks · Updated Oct 30, 2025

NVIDIA / nccl

Optimized primitives for collective multi-GPU communication

C++ · 4,221 stars · 1,064 forks · Updated Nov 10, 2025

NVIDIA / cudnn-frontend

cudnn_frontend provides a c++ wrapper for the cudnn backend API and samples on how to use it

C++ · 638 stars · 134 forks · Updated Nov 7, 2025

NVIDIA / nvImageCodec

A nvImageCodec library of GPU- and CPU-accelerated codecs featuring a unified interface

Jupyter Notebook · 123 stars · 9 forks · Updated Aug 12, 2025

TRaSH-Guides / Guides

TRaSH-Guides is a comprehensive collection of guides for Radarr, Sonarr, and related media management applications.

Shell · 2,563 stars · 287 forks · Updated Nov 12, 2025

merrymercy / awesome-tensor-compilers

A list of awesome compiler projects and papers for tensor computation and deep learning.

2,671 stars · 320 forks · Updated Oct 19, 2024

GuyTevet / motion-diffusion-model

The official PyTorch implementation of the paper "Human Motion Diffusion Model"

Python · 3,741 stars · 420 forks · Updated Oct 1, 2025

NVIDIA / cutlass

CUDA Templates and Python DSLs for High-Performance Linear Algebra

C++ · 8,769 stars · 1,523 forks · Updated Nov 10, 2025

NVIDIA / rtx_compute_samples

RTX compute samples

C++ · 70 stars · 13 forks · Updated Jun 17, 2023

NVIDIA / cub

[ARCHIVED] Cooperative primitives for CUDA C++. See https://github.com/NVIDIA/cccl

Cuda · 1,801 stars · 463 forks · Updated Oct 9, 2023
