PanAndy

LLM: RLHF algorithms and infrastructure. Quantitative investment. Deep learning: time series data; image and video object detection, tracking, and segmentation.

42 followers · 5 following

alibaba-inc
Beijing, China

Stars

onlook-dev / onlook

The Cursor for Designers • An Open-Source AI-First Design tool • Visually build, style, and edit your React App with AI

TypeScript 23,906 1,775 Updated Dec 29, 2025

radixark / miles

Miles is an enterprise-facing reinforcement learning framework for large-scale MoE post-training and production workloads, forked from and co-evolving with slime.

Python 706 72 Updated Jan 13, 2026

BehiSecc / awesome-claude-skills

A curated list of Claude Skills.

3,805 286 Updated Jan 5, 2026

alibaba / ROCK

A construction kit for reinforcement learning environment management.

Python 298 30 Updated Jan 13, 2026

Fission-AI / OpenSpec

Spec-driven development (SDD) for AI coding assistants.

TypeScript 16,835 1,155 Updated Jan 11, 2026

ValueCell-ai / valuecell

ValueCell is a community-driven, multi-agent platform for financial applications.

Python 8,098 1,418 Updated Jan 10, 2026

wshobson / agents

Intelligent automation and multi-agent orchestration for Claude Code

C# 25,207 2,778 Updated Jan 9, 2026

alibaba / InferSim

A Lightweight LLM Inference Performance Simulator

Python 53 13 Updated Jan 4, 2026

histmeisah / ROLL

Forked from alibaba/ROLL

An Efficient and User-Friendly Scaling Library for Reinforcement Learning with Large Language Models

Python 2 Updated Dec 26, 2025

alibaba / RecIS

A unified architecture deep learning framework designed specifically for ultra-large-scale sparse models.

Python 292 17 Updated Jan 9, 2026

inclusionAI / AWorld

Build, evaluate and train General Multi-Agent Assistance with ease

Python 1,094 113 Updated Jan 13, 2026

langgenius / dify

Production-ready platform for agentic workflow development.

Python 125,699 19,556 Updated Jan 13, 2026

OpenPipe / ART

Agent Reinforcement Trainer: train multi-step agents for real-world tasks using GRPO. Give your agents on-the-job training. Reinforcement learning for Qwen2.5, Qwen3, Llama, and more!

Python 8,140 646 Updated Jan 12, 2026

axon-rl / gem

A Gym for Agentic LLMs

Python 420 27 Updated Dec 31, 2025

sii-research / siiRL

siiRL: Shanghai Innovation Institute RL Framework for Advanced LLMs and Multi-Agent Systems

Python 328 25 Updated Jan 5, 2026

inclusionAI / AReaL

Lightning-Fast RL for LLM Reasoning and Agents. Made Simple & Flexible.

Python 3,386 270 Updated Jan 13, 2026

Farama-Foundation / Gymnasium

An API standard for single-agent reinforcement learning environments, with popular reference environments and related utilities (formerly Gym)

Python 11,092 1,242 Updated Jan 7, 2026

Hannibal046 / Awesome-LLM

Awesome-LLM: a curated list of Large Language Models

25,994 2,254 Updated Jul 31, 2025

ltzheng / SimpleTIR

End-to-End Reinforcement Learning for Multi-Turn Tool-Integrated Reasoning

Python 350 20 Updated Jan 12, 2026

alibaba / ROLL

An Efficient and User-Friendly Scaling Library for Reinforcement Learning with Large Language Models

Python 2,632 199 Updated Jan 13, 2026

ingydotnet / git-subrepo

Shell 3,508 281 Updated Dec 23, 2025

langfengQ / verl-agent

verl-agent is an extension of veRL, designed for training LLM/VLM agents via RL. verl-agent is also the official code for paper "Group-in-Group Policy Optimization for LLM Agent Training"

Python 1,390 119 Updated Jan 12, 2026

bobxwu / learning-from-rewards-llm-papers

A comprehensive collection of learning from rewards in the post-training and test-time scaling of LLMs, with a focus on both reward models and learning strategies across training, inference, and po…

60 2 Updated Jun 13, 2025

mll-lab-nu / RAGEN

RAGEN leverages reinforcement learning to train LLM reasoning agents in interactive, stochastic environments.

Python 2,473 199 Updated Jan 7, 2026

wdndev / llm_interview_note

Mainly records knowledge and interview questions for large language model (LLM) algorithm (application) engineers

HTML 11,789 1,192 Updated Apr 30, 2025

alimama-tech / AuctionNet

AuctionNet: A Novel Benchmark for Decision-Making in Large-Scale Games

Python 212 26 Updated Apr 25, 2025

volcengine / verl

verl: Volcano Engine Reinforcement Learning for LLMs

Python 18,266 3,014 Updated Jan 13, 2026

openpsi-project / ReaLHF

Super-Efficient RLHF Training of LLMs with Parameter Reallocation

Python 330 21 Updated Apr 24, 2025

Jhryu30 / AnomalyBERT

Python 160 26 Updated Jun 12, 2024

VarML / TACR

Python 43 18 Updated Jan 24, 2023
