  • Meituan
  • Beijing

Starred repositories

A series of technical reports on Slow Thinking with LLM

Python 743 41 Updated Aug 13, 2025

https://hrl.boyuai.com/

Jupyter Notebook 4,144 761 Updated Nov 22, 2022

Seed-Coder is a family of lightweight open-source code LLMs comprising base, instruct and reasoning models, developed by ByteDance Seed.

605 45 Updated Jun 6, 2025

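PPO x Family DRL Tutorial Course (an introductory open course on decision intelligence: eight lessons to clarify the algorithm theory, untangle the code logic, and put decision AI into practice)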

Python 2,425 204 Updated Mar 13, 2025

This is the homepage of a new book entitled "Mathematical Foundations of Reinforcement Learning."

MATLAB 12,760 1,218 Updated Oct 28, 2025

Must-read Papers on LLM Agents.

2,760 162 Updated Oct 24, 2025

Implement a ChatGPT-like LLM in PyTorch from scratch, step by step

Jupyter Notebook 78,195 11,552 Updated Nov 6, 2025

🔍 Search-o1: Agentic Search-Enhanced Large Reasoning Models [EMNLP 2025]

Python 1,082 95 Updated Aug 21, 2025

Secrets of RLHF in Large Language Models Part I: PPO

Python 1,402 104 Updated Mar 3, 2024

Latest Advances on System-2 Reasoning

Python 1,265 73 Updated Jun 8, 2025

Exploring Applications of GRPO

Python 248 34 Updated Aug 25, 2025

Fully open reproduction of DeepSeek-R1

Python 25,618 2,401 Updated Sep 8, 2025

Integrate the DeepSeek API into popular software

34,375 3,848 Updated Sep 25, 2025

A collection of industry classics and cutting-edge papers in the field of recommendation/advertising/search.

Python 1,978 256 Updated Oct 9, 2025

LLM Tuning with PEFT (SFT+RM+PPO+DPO with LoRA)

Python 437 22 Updated Oct 11, 2023

A collection of materials on computational advertising mechanisms and strategy (research and application papers about strategy in Internet advertising).

177 22 Updated Feb 18, 2024

The official implementation of Self-Play Fine-Tuning (SPIN)

Python 1,213 102 Updated May 8, 2024

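A collection of industry practice articles on search, recommendation, advertising, user growth, and related areas (sources: Zhihu, Datafuntalk, technical WeChat official accounts)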

HTML 3,974 446 Updated Nov 8, 2025

All-in-One: Text Embedding, Retrieval, Reranking and RAG in Transformers

Python 71 13 Updated Aug 10, 2025

Qwen3 is the large language model series developed by Qwen team, Alibaba Cloud.

Python 25,285 1,764 Updated Oct 13, 2025

Free ChatGPT Site List: a collection of many free, easy-to-use ChatGPT mirror sites

17,059 1,449 Updated Oct 27, 2025

https://acl2023-retrieval-lm.github.io/

JavaScript 158 15 Updated Oct 18, 2023

Official Code for Stable Cascade

Jupyter Notebook 6,582 526 Updated Jul 25, 2024

Empower Large Language Models (LLM) using Knowledge Graph based Retrieval-Augmented Generation (KG-RAG) for knowledge intensive tasks

Jupyter Notebook 906 108 Updated Nov 9, 2024

The official repo of Qwen-VL (通义千问-VL) chat & pretrained large vision language model proposed by Alibaba Cloud.

Python 6,348 470 Updated Aug 7, 2024

An open-source, commercially usable multimodal model that supports bilingual Chinese-English visual-text dialogue.

Python 376 32 Updated Sep 23, 2023

[NeurIPS'23 Oral] Visual Instruction Tuning (LLaVA) built towards GPT-4V level capabilities and beyond.

Python 23,919 2,659 Updated Aug 12, 2024

✨✨Latest Advances on Multimodal Large Language Models

16,646 1,073 Updated Nov 6, 2025

unified embedding model

Python 871 71 Updated Sep 1, 2023

Official repo for consistency models.

Python 6,433 434 Updated Mar 22, 2024