View ghh1125's full-sized avatar

This is the homepage of a new book entitled "Mathematical Foundations of Reinforcement Learning."

MATLAB 14,217 1,338 Updated Oct 28, 2025

awesome synthetic (text) datasets

Jupyter Notebook 320 16 Updated Jan 8, 2026

A reading list on LLM based Synthetic Data Generation 🔥

1,505 91 Updated Jun 5, 2025

A Curated List of Awesome Works in World Modeling, Aiming to Serve as a One-stop Resource for Researchers, Practitioners, and Enthusiasts Interested in World Modeling.

1,683 73 Updated Jan 11, 2026

A collection of papers on world models for autonomous driving (and robotics, etc.).

1,749 70 Updated Dec 22, 2025

Introduction about AWESOME_ENTROPY+LRM_PAPERS

29 1 Updated Dec 16, 2025

Awesome Large Reasoning Model (LRM) Safety. This repository collects security-related research on large reasoning models such as DeepSeek-R1 and OpenAI o1, which are currently very popular.

Python 79 6 Updated Jan 12, 2026

Awesome-Long2short-on-LRMs is a collection of state-of-the-art, novel, exciting long2short methods on large reasoning models. It contains papers, codes, datasets, evaluations, and analyses.

257 10 Updated Aug 13, 2025

😎 A Survey of Efficient Reasoning for Large Reasoning Models: Language, Multimodality, Agent, and Beyond

330 12 Updated Dec 31, 2025

verl-agent is an extension of veRL, designed for training LLM/VLM agents via RL. verl-agent is also the official code for the paper "Group-in-Group Policy Optimization for LLM Agent Training".

Python 1,387 118 Updated Jan 12, 2026

verl: Volcano Engine Reinforcement Learning for LLMs

Python 18,246 3,005 Updated Jan 12, 2026

Reproductions of large-model (LLM) algorithms, plus some study notes

Python 2,829 385 Updated Dec 27, 2025

An Open-source RL System from ByteDance Seed and Tsinghua AIR

Python 1,702 76 Updated May 11, 2025

🌐 Permanent Hosting Site: http://ai-paper-finder.info/ 🌐 Hugging Face Hosting: https://huggingface.co/spaces/wenhanacademia/ai-paper-finder

Jupyter Notebook 259 13 Updated Jan 3, 2026

AgentScope: Agent-Oriented Programming for Building LLM Applications

Python 15,391 1,309 Updated Jan 12, 2026

Continuously updated paper list on advancements in Data Agents. Companion repo to our paper "A Survey of Data Agents: Emerging Paradigm or Overstated Hype?"

Python 344 18 Updated Dec 23, 2025

Official Repo of "RobustFlow: Towards Robust Agentic Workflow Generation"

Python 231 Updated Oct 19, 2025

[Survey] A Comprehensive Survey of Self-Evolving AI Agents: A New Paradigm Bridging Foundation Models and Lifelong Agentic Systems

1,649 112 Updated Oct 11, 2025

🚀 EvoAgentX: Building a Self-Evolving Ecosystem of AI Agents

Python 2,453 195 Updated Jan 7, 2026

Democratizing Reinforcement Learning for LLMs

Python 4,967 478 Updated Jan 10, 2026

A Survey of Reinforcement Learning for Large Reasoning Models

TeX 2,240 123 Updated Nov 9, 2025

Official Repo of "Code2MCP: Transforming Code Repositories into MCP Services", Scaling Environments for Agents Workshop @ NeurIPS 2025

Python 102 11 Updated Nov 4, 2025

12 Lessons to Get Started Building AI Agents

Jupyter Notebook 48,539 16,841 Updated Jan 12, 2026
HTML 14 4 Updated Oct 9, 2025

RepoMaster: The open-source AI agent that masters GitHub. It turns any code repository into a powerful tool, achieving a new level of autonomous task-solving. An open alternative to Claude-Code.

Python 469 61 Updated Nov 5, 2025

SciToolAgent: A Knowledge Graph-Driven Scientific Agent for Multi-Tool Integration

Python 365 54 Updated Aug 26, 2025

(NeurIPS 2024) AvaTaR: Optimizing LLM Agents for Tool Usage via Contrastive Reasoning

Python 233 26 Updated Jun 10, 2025

The official repository for "Rongsheng Wang's Arxiv Template"

TeX 54 2 Updated May 7, 2025

Implementation of Toolformer: Language Models Can Teach Themselves to Use Tools

Jupyter Notebook 144 15 Updated Apr 5, 2023