cats256

LinearKAN: A very fast implementation of Kolmogorov-Arnold Networks

Python 17 1 Updated Nov 12, 2025

CUDA Embedding Lookup Kernel Library

Cuda 40 5 Updated Oct 21, 2025

Development repository for the Triton language and compiler

MLIR 18,079 2,495 Updated Jan 10, 2026
Python 1,269 123 Updated Jan 9, 2026

A collection of full time roles in SWE, Quant, and PM for new grads.

16,006 1,242 Updated Jan 10, 2026

A high-throughput and memory-efficient inference and serving engine for LLMs

Python 67,192 12,491 Updated Jan 10, 2026

Tilus is a tile-level kernel programming language with explicit control over shared memory and registers.

Python 437 16 Updated Dec 16, 2025

LeetGPU Challenges

Python 579 46 Updated Jan 3, 2026

Efficient Triton Kernels for LLM Training

Python 6,027 459 Updated Jan 7, 2026

LM engine is a library for pretraining/finetuning LLMs

Python 108 24 Updated Jan 8, 2026

A machine learning compiler for GPUs, CPUs, and ML accelerators

C++ 3,887 721 Updated Jan 10, 2026

A curated list of papers of interesting empirical study and insight on deep learning. Continually updating...

383 15 Updated Jan 7, 2026

kernels, of the mega variety

Python 641 35 Updated Sep 28, 2025

Home for "How To Scale Your Model", a short blog-style textbook about scaling LLMs on TPUs

HTML 803 116 Updated Jan 10, 2026

The AMFormer algorithm, accepted at AAAI-2024, for deep tabular learning

Python 41 10 Updated Jul 3, 2024

A modular framework for neural networks with Euclidean symmetry

Python 1,197 176 Updated Jan 9, 2026

Visualization and calculator for input & output for deep neural networks.

TypeScript 16 3 Updated Jul 28, 2025

CPU and GPU implementations of some 2D RNN layers

C++ 29 10 Updated Sep 23, 2017

You like pytorch? You like micrograd? You love tinygrad! ❤️

Python 31,103 3,830 Updated Jan 10, 2026

Chrome/Firefox extension that blocks access to distracting websites to improve your productivity.

TypeScript 367 50 Updated Nov 17, 2025

ASU-sparkysundevil-resume-template

TeX 32 21 Updated Oct 3, 2024

Tips and resources to prepare for Behavioral interviews.

7,535 1,480 Updated Aug 19, 2025

conv_visualizer

Processing 490 44 Updated Dec 1, 2024

Sample code for the Microsoft Cognitive Services Speech SDK

C# 3,381 2,000 Updated Jan 9, 2026

Flux diffusion model implementation using quantized fp8 matmul & remaining layers use faster half precision accumulate, which is ~2x faster on consumer devices.

Python 285 35 Updated Oct 12, 2024

Hypergradient descent

Python 147 21 Updated May 31, 2024

public facing repo of my algorithm running on platform

Python 4 Updated Nov 20, 2024

QuantSC Spring '23 Project

Jupyter Notebook 58 7 Updated May 18, 2023

🐙 Guides, papers, lessons, notebooks and resources for prompt engineering, context engineering, RAG, and AI Agents.

MDX 68,963 7,350 Updated Dec 29, 2025

Adaptive Quantile Activation (AQUA): A learnable activation function that dynamically adapts to input distribution

1 Updated Dec 12, 2024