Cugtyt's starred repositories


Build high-quality LLM apps, from prototyping and testing to production deployment and monitoring.

Python 10,851 1,040 Updated Nov 3, 2025

The absolute trainer to light up AI agents.

Python 7,088 536 Updated Nov 5, 2025

Lightning-Fast RL for LLM Reasoning and Agents. Made Simple & Flexible.

Python 2,954 221 Updated Nov 5, 2025

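This repository contains the complete research and analysis materials from reverse engineering Claude Code v1.0.33. It includes in-depth technical analysis of the obfuscated source code, system architecture documentation, and an implementation blueprint for reconstructing the Claude Code agent system. Key findings include the real-time Steering mechanism, the multi-agent architecture, intelligent context management, and the tool execution pipeline. The project serves as a technical reference for understanding the design and implementation of modern AI agent systems.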

JavaScript 11,125 2,927 Updated Jul 19, 2025

Agent Reinforcement Trainer: train multi-step agents for real-world tasks using GRPO. Give your agents on-the-job training. Reinforcement learning for Qwen2.5, Qwen3, Llama, and more!

Python 7,780 599 Updated Oct 24, 2025

Kubernetes AI Toolchain Operator

Go 805 145 Updated Nov 5, 2025

🤗 smolagents: a barebones library for agents that think in code.

Python 23,787 2,098 Updated Oct 30, 2025

Playwright MCP server

TypeScript 22,845 1,837 Updated Nov 4, 2025

Playwright is a framework for Web Testing and Automation. It allows testing Chromium, Firefox and WebKit with a single API.

TypeScript 78,998 4,788 Updated Nov 5, 2025

An open protocol enabling communication and interoperability between opaque agentic applications.

TypeScript 20,559 2,086 Updated Nov 5, 2025

Democratizing Reinforcement Learning for LLMs

Jupyter Notebook 4,673 439 Updated Nov 4, 2025
Python 308 15 Updated May 24, 2025

Production-ready platform for agentic workflow development.

TypeScript 118,125 18,257 Updated Nov 5, 2025

🐫 CAMEL: The first and the best multi-agent framework. Finding the Scaling Law of Agents. https://www.camel-ai.org

Python 14,705 1,614 Updated Nov 5, 2025

🦉 OWL: Optimized Workforce Learning for General Multi-Agent Assistance in Real-World Task Automation

Python 18,277 2,120 Updated Sep 24, 2025

Qwen2.5-Omni is an end-to-end multimodal model by Qwen team at Alibaba Cloud, capable of understanding text, audio, vision, video, and performing real-time speech generation.

Jupyter Notebook 3,767 295 Updated Jun 12, 2025

Train your AI self, amplify you, bridge the world

Python 14,548 1,110 Updated Sep 30, 2025

A course on aligning smol models.

Jupyter Notebook 6,482 2,297 Updated Nov 4, 2025

verl: Volcano Engine Reinforcement Learning for LLMs

Python 15,141 2,428 Updated Nov 5, 2025
Python 1,332 120 Updated Sep 12, 2025

This repository has code for fine-tuning LLMs with GRPO specifically for Rust Programming using cargo as feedback

Python 111 9 Updated Mar 8, 2025

RAGEN leverages reinforcement learning to train LLM reasoning agents in interactive, stochastic environments.

Python 2,388 185 Updated Nov 4, 2025

A live-stream development of RL tuning for LLM agents

Python 3,573 498 Updated Oct 8, 2025

A lightweight, powerful framework for multi-agent workflows

Python 17,114 2,817 Updated Nov 5, 2025

FlashMLA: Efficient Multi-head Latent Attention Kernels

C++ 11,843 896 Updated Sep 30, 2025

DeepEP: an efficient expert-parallel communication library

Cuda 8,691 972 Updated Nov 5, 2025

A very simple GRPO implementation for reproducing R1-like LLM thinking.

Python 1,427 109 Updated Aug 5, 2025

Simple RL training for reasoning

Python 3,782 279 Updated Aug 3, 2025

Train transformer language models with reinforcement learning.

Python 16,171 2,277 Updated Nov 6, 2025

Fully open reproduction of DeepSeek-R1

Python 25,614 2,401 Updated Sep 8, 2025