Stars
Remake of the original Super Mario Bros game.
Learning in infinite dimension with neural operators.
A linear algebra library for the Zig programming language
DeepGEMM: clean and efficient FP8 GEMM kernels with fine-grained scaling
An intuitive and low-overhead instrumentation tool for Python
Pytorch implementation for MeanFlow
Ongoing research training transformer models at scale
VideoNSA: Native Sparse Attention Scales Video Understanding
A library of GPU kernels for sparse matrix operations.
Speedup the attention computation of Swin Transformer
A tiny deep learning training framework implemented from scratch in C++ that follows PyTorch's API.
Pytorch Implementation (unofficial) of the paper "Mean Flows for One-step Generative Modeling" by Geng et al.
A Python toolkit for fine-tuning Geospatial Foundation Models (GFMs).
[NeurIPS 2025] An official implementation of Flow-GRPO: Training Flow Matching Models via Online RL
Fast and memory-efficient exact attention
Domain-specific language designed to streamline the development of high-performance GPU/CPU/Accelerators kernels
Code for loralib, an implementation of "LoRA: Low-Rank Adaptation of Large Language Models"
generation of training-optimised weather datasets from declarative syntax