Starred repositories
This project explores transformer improvements through Super-Transformers (scalable softmax, positional encoding ablation) and MiniDeepSeek (multi-latent attention, KV cache, weight absorption) to …
Unofficial implementation of "Scalable-Softmax Is Superior for Attention" (see the sketch after this list)
Official code implementation of Context Cascade Compression: Exploring the Upper Limits of Text Compression
Unofficial Implementation of Selective Attention Transformer
Implementation of paper Data Engineering for Scaling Language Models to 128K Context
Implementation of "Efficient Training of Language Models to Fill in the Middle"
A Reproduction of GDM's Nested Learning Paper
Sequential Diffusion Language Model (SDLM) enhances pre-trained autoregressive language models by adaptively determining generation length and maintaining KV-cache compatibility, achieving high eff…
A community-driven list of open-source alternatives to proprietary software and applications.
A unified library for building, evaluating, and storing speculative decoding algorithms for LLM inference in vLLM
DLER: Doing Length pEnalty Right - Incentivizing More Intelligence per Token via Reinforcement Learning
CUDA Python: Performance meets Productivity
🌐 The open-source Agentic browser; privacy-first alternative to ChatGPT Atlas, Perplexity Comet, Dia.
Official repository for Adaptive Parallel Decoding (APD).
Best practices for running Megatron on veRL, with a tuning guide
A Google Apps Script for syncing ICS/ICAL files faster than Google Calendar's built-in sync
Generate a timeline of your day, automatically
Practical productivity tools for Claude Code, Codex-CLI, and similar CLI coding agents.
An efficient implementation of the NSA (Native Sparse Attention) kernel
Render any git repo into a single static HTML page for humans or LLMs
A Visual Studio Code plugin that highlights selected words; very useful when reading code.
The Cursor for Designers • An Open-Source AI-First Design tool • Visually build, style, and edit your React App with AI
Meta Lingua: a lean, efficient, and easy-to-hack codebase to research LLMs.
A domain-specific language designed to streamline the development of high-performance GPU/CPU/accelerator kernels
Official Repo for Open-Reasoner-Zero
Production-ready platform for agentic workflow development.
verl: Volcano Engine Reinforcement Learning for LLMs
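Since Scalable-Softmax appears twice above (the Super-Transformers experiments and the unofficial paper implementation), here is a minimal sketch of the core idea from "Scalable-Softmax Is Superior for Attention": scale the attention logits by s·log(n), where n is the number of key positions, before a standard softmax, so the attention distribution does not flatten as the context grows. The function name, the PyTorch framing, and the fixed value of s are illustrative assumptions, not code from the starred repositories.

```python
import math

import torch


def ssmax(scores: torch.Tensor, s: float = 1.0) -> torch.Tensor:
    """Scalable-Softmax (SSMax) over the last dimension.

    Multiplies the logits by s * log(n), where n is the number of key
    positions, before a standard softmax. This is equivalent to using n
    as the exponent base (n^(s*x) instead of e^x), which keeps attention
    from fading toward uniform as the context length grows. In the paper
    s is a learnable parameter; a fixed scalar is used here only for
    illustration.
    """
    n = scores.size(-1)
    return torch.softmax(s * math.log(n) * scores, dim=-1)


# Toy usage: attention weights over 8 key positions for a single query.
logits = torch.randn(1, 8)
print(ssmax(logits))
```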