Starred repositories
Build high-quality LLM apps - from prototyping and testing to production deployment and monitoring.
The absolute trainer to light up AI agents.
Lightning-Fast RL for LLM Reasoning and Agents. Made Simple & Flexible.
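This repository contains complete research and analysis materials from reverse engineering Claude Code v1.0.33. It includes in-depth technical analysis of the obfuscated source code, system architecture documentation, and an implementation blueprint for reconstructing the Claude Code agent system. Key findings include the real-time Steering mechanism, the multi-agent architecture, intelligent context management, and the tool execution pipeline. The project serves as a technical reference for understanding how modern AI agent systems are designed and implemented.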
Agent Reinforcement Trainer: train multi-step agents for real-world tasks using GRPO. Give your agents on-the-job training. Reinforcement learning for Qwen2.5, Qwen3, Llama, and more!
🤗 smolagents: a barebones library for agents that think in code.
Playwright is a framework for Web Testing and Automation. It allows testing Chromium, Firefox and WebKit with a single API.
An open protocol enabling communication and interoperability between opaque agentic applications.
Democratizing Reinforcement Learning for LLMs
Production-ready platform for agentic workflow development.
🐫 CAMEL: The first and the best multi-agent framework. Finding the Scaling Law of Agents. https://www.camel-ai.org
🦉 OWL: Optimized Workforce Learning for General Multi-Agent Assistance in Real-World Task Automation
Qwen2.5-Omni is an end-to-end multimodal model by Qwen team at Alibaba Cloud, capable of understanding text, audio, vision, video, and performing real-time speech generation.
Train your AI self, amplify you, bridge the world
A course on aligning smol models.
verl: Volcano Engine Reinforcement Learning for LLMs
This repository has code for fine-tuning LLMs with GRPO, specifically for Rust programming, using cargo as feedback.
RAGEN leverages reinforcement learning to train LLM reasoning agents in interactive, stochastic environments.
A live-streamed development of RL tuning for LLM agents
A lightweight, powerful framework for multi-agent workflows
FlashMLA: Efficient Multi-head Latent Attention Kernels
DeepEP: an efficient expert-parallel communication library
A very simple GRPO implementation for reproducing R1-like LLM thinking.
Train transformer language models with reinforcement learning.
Fully open reproduction of DeepSeek-R1