Stars
Minimal reproduction of DeepSeek R1-Zero
verl: Volcano Engine Reinforcement Learning for LLMs
Cosmos-RL is a flexible and scalable Reinforcement Learning framework specialized for Physical AI applications.
cuTile is a programming model for writing parallel kernels for NVIDIA GPUs
💫 Toolkit to help you get started with Spec-Driven Development
FP16xINT4 LLM inference kernel that achieves near-ideal ~4x speedups up to medium batch sizes of 16-32 tokens.
Implementation of Vision Transformer, a simple way to achieve SOTA in vision classification with only a single transformer encoder, in PyTorch
Tritonbench is a collection of PyTorch custom operators with example inputs to measure their performance.
Collection of benchmarks to measure basic GPU capabilities
CRS: a self-hosted Claude Code mirror and one-stop open-source relay service that lets Claude, OpenAI, Gemini, and Droid subscriptions share a unified endpoint, with pooled sharing to split costs efficiently and seamless use of native tools.
Use Claude Code as the foundation for coding infrastructure, allowing you to decide how to interact with the model while enjoying updates from Anthropic.
Lichtblick is an integrated visualization and diagnosis tool for robotics, available in your browser or as a desktop app on Linux, Windows, and macOS.
LightLLM is a Python-based LLM (Large Language Model) inference and serving framework, notable for its lightweight design, easy scalability, and high-speed performance.
🚀 🌏 Train a 26M-parameter multimodal vision-language model (VLM) from scratch in just 1 hour!
🚀🚀 🌏 Train a small 26M-parameter GPT completely from scratch in just 2 hours!
🚀 A simple way to launch, train, and use PyTorch models on almost any device and distributed configuration, with automatic mixed precision (including fp8) and easy-to-configure FSDP and DeepSpeed support
VILA is a family of state-of-the-art vision language models (VLMs) for diverse multimodal AI tasks across the edge, data center, and cloud.
A PyTorch native platform for training generative AI models
Implement a ChatGPT-like LLM in PyTorch from scratch, step by step
Freeing data processing from scripting madness by providing a set of platform-agnostic customizable pipeline processing blocks.
An easy-to-use, scalable, and high-performance agentic RL framework based on Ray (PPO & DAPO & REINFORCE++ & TIS & vLLM & Ray & Async RL)
A framework for few-shot evaluation of language models.
Efficient Triton Kernels for LLM Training
A library for mechanistic interpretability of GPT-style language models
Fine-tuning & Reinforcement Learning for LLMs. 🦥 Train OpenAI gpt-oss, DeepSeek, Qwen, Llama, Gemma, TTS 2x faster with 70% less VRAM.
Building blocks for foundation models.
Large Language Model Text Generation Inference