💫 Toolkit to help you get started with Spec-Driven Development

Python 61,836 5,374 Updated Dec 4, 2025

Recipes for reproducing training and serving benchmarks for large machine learning models using GPUs on Google Cloud.

Python 109 55 Updated Jan 10, 2026

Debug the intermediate outputs of two models.

HTML 2 Updated Aug 8, 2025

An Adaptive Pencil Decomposition Library for NVIDIA GPUs

C++ 74 11 Updated Dec 2, 2025

System Level Intelligent Router for Mixture-of-Models at Cloud, Data Center and Edge

Go 2,750 414 Updated Jan 12, 2026

Agent framework and applications built upon Qwen>=3.0, featuring Function Calling, MCP, Code Interpreter, RAG, Chrome extension, etc.

Python 12,913 1,206 Updated Sep 26, 2025

CUDA-L1: Improving CUDA Optimization via Contrastive Reinforcement Learning

Python 277 79 Updated Nov 3, 2025

A configuration framework that enhances Claude Code with specialized commands, cognitive personas, and development methodologies.

Python 20,017 1,733 Updated Jan 10, 2026

Agent Reinforcement Trainer: train multi-step agents for real-world tasks using GRPO. Give your agents on-the-job training. Reinforcement learning for Qwen2.5, Qwen3, Llama, and more!

Python 8,133 645 Updated Jan 10, 2026

MOSS-TTSD is a spoken dialogue generation model that enables expressive dialogue speech synthesis in both Chinese and English, supporting zero-shot multi-speaker voice cloning, and long-form speech…

Python 1,075 95 Updated Dec 8, 2025

SWE-agent takes a GitHub issue and tries to automatically fix it, using your LM of choice. It can also be employed for offensive cybersecurity or competitive coding challenges. [NeurIPS 2024]

Python 18,227 1,953 Updated Dec 29, 2025

A generative speech model for daily dialogue.

Python 38,508 4,186 Updated Dec 3, 2025

TensorRT LLM provides users with an easy-to-use Python API to define Large Language Models (LLMs) and supports state-of-the-art optimizations to perform inference efficiently on NVIDIA GPUs. Tensor…

Python 12,604 2,003 Updated Jan 12, 2026

kernels, of the mega variety

Python 644 35 Updated Sep 28, 2025

FULL Augment Code, Claude Code, Cluely, CodeBuddy, Comet, Cursor, Devin AI, Junie, Kiro, Leap.new, Lovable, Manus, NotionAI, Orchids.app, Perplexity, Poke, Qoder, Replit, Same.dev, Trae, Traycer AI…

107,584 28,276 Updated Jan 8, 2026

A prefill & decode disaggregated LLM serving framework with shared GPU memory and fine-grained compute isolation.

Python 123 15 Updated Dec 25, 2025

Distributed Compiler based on Triton for Parallel Systems

Python 1,311 116 Updated Dec 27, 2025

Invert scroll direction for physical scroll wheels while maintaining "Natural" scrolling for trackpads on macOS

Swift 3,887 84 Updated Dec 2, 2025

A collection of reproducible inference engine benchmarks

Shell 38 1 Updated Apr 22, 2025

Perplexity GPU Kernels

C++ 552 75 Updated Nov 7, 2025

ByteCheckpoint: A Unified Checkpointing Library for LFMs

Python 260 17 Updated Dec 8, 2025

A Datacenter Scale Distributed Inference Serving Framework

Rust 5,761 774 Updated Jan 12, 2026

The complete stack for AI Engineers: framework, runtime and control plane.

Python 36,811 4,873 Updated Jan 12, 2026

A fast communication-overlapping library for tensor/expert parallelism on GPUs.

C++ 1,220 85 Updated Aug 28, 2025

📄 Configuration files that enhance Cursor AI editor experience with custom rules and behaviors

MDX 36,930 3,132 Updated Oct 24, 2025

🌐 Make websites accessible for AI agents. Automate tasks online with ease.

Python 75,291 8,986 Updated Jan 11, 2026

Run, manage, and scale AI workloads on any AI infrastructure. Use one system to access & manage all AI compute (Kubernetes, 20+ clouds, or on-prem).

Python 9,197 902 Updated Jan 12, 2026

Tile primitives for speedy kernels

Cuda 3,056 224 Updated Jan 12, 2026