Fast and exact implementation of the C++ from_chars functions for number types: 4x to 10x faster than strtod, part of GCC 12, MySQL, Chromium, Redis and WebKit/Safari

C++ · 1,917 stars · 167 forks · Updated Nov 3, 2025

Ledzy / StreamBP

Official code of "StreamBP: Memory-Efficient Exact Backpropagation for Long Sequence Training of LLMs".

Python · 74 stars · 5 forks · Updated Jun 23, 2025

locuslab / torchdeq

Modern Fixed Point Systems using PyTorch

Python · 122 stars · 13 forks · Updated Oct 31, 2023

thinking-machines-lab / batch_invariant_ops

Python · 903 stars · 69 forks · Updated Nov 4, 2025

stas00 / the-art-of-debugging

The Art of Debugging

Python · 1,140 stars · 55 forks · Updated Nov 17, 2025

ML-GSAI / Diffusion-LLM-Papers

A Collection of Papers on Diffusion Language Models

145 stars · 6 forks · Updated Sep 15, 2025

SakanaAI / continuous-thought-machines

Continuous Thought Machines, because thought takes time and reasoning is a process.

Python · 1,397 stars · 205 forks · Updated Oct 14, 2025

facebookresearch / PhysicsLM4

Physics of Language Models, Part 4

HTML · 260 stars · 13 forks · Updated Jul 29, 2025

cfregly / ai-performance-engineering

Python · 557 stars · 69 forks · Updated Nov 17, 2025

huggingface / hub-docs

Docs of the Hugging Face Hub

Handlebars · 471 stars · 378 forks · Updated Nov 17, 2025

tile-ai / tilelang

Domain-specific language designed to streamline the development of high-performance GPU/CPU/accelerator kernels

C++ · 3,941 stars · 318 forks · Updated Nov 17, 2025

chengemily1 / id-llm-abstraction

Code for ICLR 2025 paper "Emergence of a High-Dimensional Abstraction Phase in Language Transformers"

Python · 3 stars · 1 fork · Updated Jan 23, 2025

facebookresearch / coconut

Training Large Language Models to Reason in a Continuous Latent Space

Python · 1,338 stars · 141 forks · Updated Aug 12, 2025

arogozhnikov / einops

Flexible and powerful tensor operations for readable and reliable code (for pytorch, jax, TF and others)

Python · 9,274 stars · 389 forks · Updated Aug 12, 2025

ucrparlay / Pkd-tree

[SIGMOD '25] A fast parallel kd-tree implementation

C++ · 85 stars · 5 forks · Updated Nov 16, 2025

vllm-project / aibrix

Cost-efficient and pluggable infrastructure components for GenAI inference

Go · 4,412 stars · 485 forks · Updated Nov 17, 2025

zhaochenyang20 / Awesome-ML-SYS-Tutorial

My learning notes and code for ML systems.

Python · 4,186 stars · 254 forks · Updated Nov 17, 2025

junfanz1 / Software-Engineer-Coding-Interviews

Data structures and algorithms, (GenAI/ML) system design, machine learning, and DevOps coding interview practice

516 stars · 135 forks · Updated Oct 7, 2025

ndif-team / nnsight

The nnsight package enables interpreting and manipulating the internals of deep learned models.

Jupyter Notebook · 697 stars · 63 forks · Updated Nov 14, 2025

Joshua-Ren / Learning_dynamics_LLM

Jupyter Notebook · 182 stars · 9 forks · Updated May 16, 2025

lablup / backend.ai

Backend.AI is a streamlined, container-based computing cluster platform that hosts popular computing/ML frameworks and diverse programming languages, with pluggable heterogeneous accelerator suppor…

Python · 588 stars · 163 forks · Updated Nov 17, 2025

SakanaAI / TAID

Official implementation of "TAID: Temporally Adaptive Interpolated Distillation for Efficient Knowledge Transfer in Language Models"

Python · 118 stars · 9 forks · Updated Oct 6, 2025

qiuzh20 / gated_attention

The official implementation of [NeurIPS 2025 Oral] Gated Attention for Large Language Models: Non-linearity, Sparsity, and Attention-Sink-Free

Jupyter Notebook · 106 stars · 6 forks · Updated Sep 19, 2025

rkinas / triton-resources

A curated list of resources for learning and exploring Triton, OpenAI's programming language for writing efficient GPU code.

Python · 431 stars · 26 forks · Updated Mar 10, 2025

Kim Jae-Jin (김재진) kkuoo7
