🎯 Focusing
Master's student @ THU
- Tsinghua University
- Qingdao (UTC +08:00)
- https://ryanliu112.github.io
- https://scholar.google.com/citations?user=LiIfGakAAAAJ
Highlights
- Pro
Pinned
- TsinghuaC3I/Awesome-RL-for-LRMs (Public): A Survey of Reinforcement Learning for Large Reasoning Models
- TsinghuaC3I/MARTI (Public): A Framework for LLM-based Multi-Agent Reinforced Training and Inference
- compute-optimal-tts (Public): Official codebase for "Can 1B LLM Surpass 405B LLM? Rethinking Compute-Optimal Test-Time Scaling".
- Awesome-Process-Reward-Models (Public): A comprehensive collection of process reward models.
- wizard-III/ArcherCodeR (Public): ArcherCodeR is an open-source initiative enhancing code reasoning in large language models through scalable, rule-governed reinforcement learning.