Stars
Unofficial implementation of the toy example in JiT (https://arxiv.org/abs/2511.13720)
MoBA: Mixture of Block Attention for Long-Context LLMs
Code for "What really matters in matrix-whitening optimizers?"
Let's train vision transformers (ViT) on CIFAR-10 / CIFAR-100!
An efficient implementation of the NSA (Native Sparse Attention) kernel
PyTorch implementation of MeanFlow on ImageNet and CIFAR-10
[CVPR 2025] "DiC: Rethinking Conv3x3 Designs in Diffusion Models", a performant & speedy Conv3x3 diffusion model.
[NeurIPS 2025 Spotlight] DiCo: Revitalizing ConvNets for Scalable and Efficient Diffusion Modeling
Fast, Numerically Stable, and Auto-Differentiable Spectral Clipping via Newton-Schulz Iteration
PyTorch version of Stable Baselines, reliable implementations of reinforcement learning algorithms.
Official implementation of the paper: "ZClip: Adaptive Spike Mitigation for LLM Pre-Training".
F Lite is a 10B parameter diffusion model created by Freepik and Fal, trained exclusively on copyright-safe and SFW content.
Schedule-Free Optimization in PyTorch
Official implementation of Inductive Moment Matching
An End-To-End, Lightweight and Flexible Platform for Game Research
⚡ Flash Diffusion ⚡: Accelerating Any Conditional Diffusion Model for Few Steps Image Generation (AAAI 2025 Oral)
Research implementation of Native Sparse Attention (arXiv:2502.11089)
Lightweight package that tracks and summarizes code changes using LLMs (Large Language Models)
Training Large Language Model to Reason in a Continuous Latent Space
Library for reading and processing ML training data.