
Agentic RAG R1 Framework via Reinforcement Learning

Python 367 43 Updated Jan 13, 2026

Train DeepSearch Agent.

Python 2 Updated Oct 28, 2025

Fine-tuning & Reinforcement Learning for LLMs. 🦥 Train OpenAI gpt-oss, DeepSeek, Qwen, Llama, Gemma, TTS 2x faster with 70% less VRAM.

Python 50,749 4,187 Updated Jan 15, 2026

ThinkDepth.ai Deep Research

Jupyter Notebook 127 35 Updated Jan 5, 2026

DeepAnalyze is the first agentic LLM for autonomous data science. 🎈 Your AI data analyst: automatically analyzes large amounts of data and generates professional analysis reports with one click!

Python 3,489 515 Updated Jan 14, 2026

Autonomously train research-agent LLMs on custom data using reinforcement learning and self-verification.

Jupyter Notebook 680 61 Updated Mar 22, 2025

Search-R1: An Efficient, Scalable RL Training Framework for Reasoning & Search Engine Calling interleaved LLM based on veRL

Python 3,822 323 Updated Nov 13, 2025

Building a large MoE model as an individual: a complete walkthrough from pre-training to DPO

Python 2,244 164 Updated Dec 30, 2025

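"AI-Compass" guides the community in navigating the ocean of AI technology. Whether you are a beginner or an advanced developer, you can find a path here to each major area of AI. It aims to help developers systematically understand AI's core concepts, mainstream technologies, and frontier trends, and to master the full process from theory to deployment through practice.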

508 63 Updated Dec 11, 2025

Reproducing R1 for Code with Reliable Rewards

Python 280 16 Updated May 5, 2025

This is a continuously updated handbook for readers to easily track the latest Text-to-SQL techniques in the literature and provide practical guidance for researchers and practitioners.

Python 1,251 74 Updated Jan 13, 2026

Agentar-Scale-SQL is a novel framework that leverages scalable computation to significantly improve Text-to-SQL performance.

Python 324 32 Updated Dec 16, 2025

Lightning-Fast RL for LLM Reasoning and Agents. Made Simple & Flexible.

Python 3,398 273 Updated Jan 16, 2026

Code and implementations for the paper "AgentGym-RL: Training LLM Agents for Long-Horizon Decision Making through Multi-Turn Reinforcement Learning" by Zhiheng Xi et al.

Python 559 60 Updated Sep 11, 2025

Train your Agent model via our easy and efficient framework

Python 1,687 160 Updated Dec 5, 2025

Music-Aligned Holistic 3D Dance Generation via Hierarchical Motion Modeling [ICCV 2025] Official PyTorch implementation

Python 26 1 Updated Nov 11, 2025

SkyRL: A Modular Full-stack RL Library for LLMs

Python 1,454 222 Updated Jan 15, 2026

Agent that converts natural language queries into SQL and returns both the response and the generated query

Jupyter Notebook 54 55 Updated May 28, 2025

personal chatgpt

Jupyter Notebook 404 72 Updated Jan 11, 2026

The latest research progress of Contrastive Learning(CL), Data Augmentation(DA) and Self-Supervised Learning(SSL) in Recommender Systems

427 36 Updated Sep 2, 2025

https://layer6ai-labs.github.io/xpool/

Python 133 10 Updated Jul 1, 2023

Code for ICMR 2024 paper "BeatDance: A Beat-Based Model-Agnostic Contrastive Learning Framework for Music-Dance Retrieval"

Python 8 2 Updated Jun 15, 2024

Official PyTorch implementation of the paper "MotionCLIP: Exposing Human Motion Generation to CLIP Space"

Python 489 47 Updated Dec 18, 2023

[ICCV 2025] This repo is the official implementation of "Music Grounding by Short Video"

Python 26 2 Updated Sep 9, 2025

This is the official code implementation for "M2Beats 2.0: When Motion Meets Beats in Short-form Videos Twice". More details will be released once the paper is published!

Python 5 1 Updated Apr 15, 2025

LLM training in simple, raw C/CUDA

Cuda 28,610 3,356 Updated Jun 26, 2025

How can we build a true AI agent? Like Claude Code.

Python 14,326 3,329 Updated Jan 7, 2026

Implement a ChatGPT-like LLM in PyTorch from scratch, step by step

Jupyter Notebook 83,120 12,496 Updated Jan 14, 2026

Single Motion Diffusion Model

Python 409 23 Updated Mar 26, 2025