View leofang's full-sized avatar

Organizations

@NVIDIA @mpi4py @conda-forge @cupy @rapidsai

cuTile is a programming model for writing parallel kernels for NVIDIA GPUs

Python 1,797 95 Updated Jan 9, 2026

Nsight Python is a Python kernel profiling interface based on NVIDIA Nsight Tools

Python 89 7 Updated Jan 9, 2026

A conda plugin which creates NVIDIA-specific virtual packages

Python 7 Updated Nov 12, 2025

NVIDIA NVSHMEM is a parallel programming interface for NVIDIA GPUs based on OpenSHMEM. NVSHMEM can significantly reduce multi-process communication and coordination overheads by allowing programmer…

C++ 443 52 Updated Dec 31, 2025

Schema validation just got Pythonic

Python 2,940 215 Updated Oct 26, 2025

Vector classes and utilities

Python 95 36 Updated Jan 5, 2026

Linter that finds portability issues in Python package distributions (wheels, sdists, conda packages).

Python 44 4 Updated Jan 10, 2026

NumPy & SciPy for GPU

Python 10,711 982 Updated Jan 10, 2026

Manipulating ragged arrays in an Array API compliant way.

Python 45 8 Updated Dec 14, 2025

A single-header C++ library for simplifying the use of CUDA Runtime Compilation (NVRTC).

C++ 568 73 Updated Sep 15, 2025

Reusable GitHub Actions workflows for RAPIDS CI

Shell 7 25 Updated Jan 9, 2026

Let your Claude think

TypeScript 16,689 1,968 Updated Nov 4, 2025

Experimental projects related to TensorRT

MLIR 117 22 Updated Jan 10, 2026

Parsing gigabytes of JSON per second: used by Facebook/Meta Velox, the Node.js runtime, ClickHouse, WatermelonDB, Apache Doris, Milvus, StarRocks

C++ 23,090 1,204 Updated Jan 9, 2026

Dr.Jit — A Just-In-Time-Compiler for Differentiable Rendering

C++ 733 55 Updated Jan 10, 2026

Python library for generating high-performance implementations of stencil kernels for weather and climate modeling from a domain-specific language (DSL).

Python 136 54 Updated Jan 8, 2026

The CUDA target for Numba

Python 240 54 Updated Jan 9, 2026

A Python module for decorators, wrappers and monkey patching.

Python 2,251 244 Updated Jan 1, 2026

Download Taiwan financial market data via FMD API.

Python 29 Updated Mar 6, 2025

DaCe - Data Centric Parallel Programming

Python 570 149 Updated Jan 11, 2026

A retargetable MLIR-based machine learning compiler and runtime toolkit.

C++ 3,548 820 Updated Jan 11, 2026

Python disk-backed cache (Django-compatible). Faster than Redis and Memcached. Pure-Python.

Python 2,776 155 Updated Aug 10, 2024

NVIDIA curated collection of educational resources related to general purpose GPU programming.

Jupyter Notebook 1,070 190 Updated Jan 9, 2026

NVIDIA Math Libraries for the Python Ecosystem

Cython 541 31 Updated Nov 17, 2025

芫荽, a Traditional Chinese (Taiwan) typeface for learners, adapted from Klee One

Python 1,925 69 Updated May 30, 2025

GPU Development in Python 101 tutorial

Jupyter Notebook 276 68 Updated Oct 15, 2024

A library for detecting, labeling, and reasoning about microarchitectures

Python 124 32 Updated Jan 7, 2026

JupyterLite demo deployed to GitHub Pages 🚀

Jupyter Notebook 413 240 Updated Dec 16, 2025

A massively parallel, high-level programming language

Rust 19,130 470 Updated Jun 3, 2025

A cross-version Python bytecode decompiler

Python 4,201 450 Updated Jan 10, 2026