Stars
⛄ Possibly the smallest compiler ever
Foam-Agent: An end-to-end, composable multi-agent framework for automating CFD simulations in OpenFOAM. NeurIPS 2025 Machine Learning and the Physical Sciences Workshop.
Remake of the original Super Mario Bros game.
Learning in infinite dimension with neural operators.
A linear algebra library for the Zig programming language
DeepGEMM: clean and efficient FP8 GEMM kernels with fine-grained scaling
An intuitive and low-overhead instrumentation tool for Python
PyTorch implementation of MeanFlow
Ongoing research training transformer models at scale
VideoNSA: Native Sparse Attention Scales Video Understanding
A library of GPU kernels for sparse matrix operations.
Speed up the attention computation of Swin Transformer
A tiny deep learning training framework implemented from scratch in C++ that follows PyTorch's API.
Unofficial PyTorch implementation of the paper "Mean Flows for One-step Generative Modeling" by Geng et al.
A Python toolkit for fine-tuning Geospatial Foundation Models (GFMs).
[NeurIPS 2025] An official implementation of Flow-GRPO: Training Flow Matching Models via Online RL
Fast and memory-efficient exact attention
A domain-specific language designed to streamline the development of high-performance GPU/CPU/accelerator kernels
Code for loralib, an implementation of "LoRA: Low-Rank Adaptation of Large Language Models"