PanAndy
  • alibaba-inc
  • Beijing, China

The Cursor for Designers • An Open-Source AI-First Design tool • Visually build, style, and edit your React App with AI

TypeScript 23,906 1,775 Updated Dec 29, 2025

Miles is an enterprise-facing reinforcement learning framework for large-scale MoE post-training and production workloads, forked from and co-evolving with slime.

Python 706 72 Updated Jan 13, 2026

A curated list of Claude Skills.

3,805 286 Updated Jan 5, 2026

A construction kit for reinforcement learning environment management.

Python 298 30 Updated Jan 13, 2026

Spec-driven development (SDD) for AI coding assistants.

TypeScript 16,835 1,155 Updated Jan 11, 2026

ValueCell is a community-driven, multi-agent platform for financial applications.

Python 8,098 1,418 Updated Jan 10, 2026

Intelligent automation and multi-agent orchestration for Claude Code

C# 25,207 2,778 Updated Jan 9, 2026

A Lightweight LLM Inference Performance Simulator

Python 53 13 Updated Jan 4, 2026

An Efficient and User-Friendly Scaling Library for Reinforcement Learning with Large Language Models

Python 2 Updated Dec 26, 2025

A unified architecture deep learning framework designed specifically for ultra-large-scale sparse models.

Python 292 17 Updated Jan 9, 2026

Build, evaluate and train General Multi-Agent Assistance with ease

Python 1,094 113 Updated Jan 13, 2026

Production-ready platform for agentic workflow development.

Python 125,699 19,556 Updated Jan 13, 2026

Agent Reinforcement Trainer: train multi-step agents for real-world tasks using GRPO. Give your agents on-the-job training. Reinforcement learning for Qwen2.5, Qwen3, Llama, and more!

Python 8,140 646 Updated Jan 12, 2026

A Gym for Agentic LLMs

Python 420 27 Updated Dec 31, 2025

siiRL: Shanghai Innovation Institute RL Framework for Advanced LLMs and Multi-Agent Systems

Python 328 25 Updated Jan 5, 2026

Lightning-Fast RL for LLM Reasoning and Agents. Made Simple & Flexible.

Python 3,386 270 Updated Jan 13, 2026

An API standard for single-agent reinforcement learning environments, with popular reference environments and related utilities (formerly Gym)

Python 11,092 1,242 Updated Jan 7, 2026

Awesome-LLM: a curated list of Large Language Models

25,994 2,254 Updated Jul 31, 2025

End-to-End Reinforcement Learning for Multi-Turn Tool-Integrated Reasoning

Python 350 20 Updated Jan 12, 2026

An Efficient and User-Friendly Scaling Library for Reinforcement Learning with Large Language Models

Python 2,632 199 Updated Jan 13, 2026

verl-agent is an extension of veRL, designed for training LLM/VLM agents via RL. verl-agent is also the official code for paper "Group-in-Group Policy Optimization for LLM Agent Training"

Python 1,390 119 Updated Jan 12, 2026

A comprehensive collection of learning from rewards in the post-training and test-time scaling of LLMs, with a focus on both reward models and learning strategies across training, inference, and po…

60 2 Updated Jun 13, 2025

RAGEN leverages reinforcement learning to train LLM reasoning agents in interactive, stochastic environments.

Python 2,473 199 Updated Jan 7, 2026

Mainly records knowledge and interview questions relevant to large language model (LLM) algorithm (application) engineers

HTML 11,789 1,192 Updated Apr 30, 2025

AuctionNet: A Novel Benchmark for Decision-Making in Large-Scale Games

Python 212 26 Updated Apr 25, 2025

verl: Volcano Engine Reinforcement Learning for LLMs

Python 18,266 3,014 Updated Jan 13, 2026

Super-Efficient RLHF Training of LLMs with Parameter Reallocation

Python 330 21 Updated Apr 24, 2025
Python 160 26 Updated Jun 12, 2024
Python 43 18 Updated Jan 24, 2023