Starred repositories

An RLHF framework that enhances OpenRLHF.

Python 2 Updated Mar 19, 2025

Recipes to train the self-rewarding reasoning LLMs.

Python 228 11 Updated Mar 2, 2025

On Memorization of Large Language Models in Logical Reasoning

Python 72 6 Updated Mar 29, 2025

A unified suite for generating elite reasoning problems and training high-performance LLMs, including pioneering attention-free architectures

Python 130 12 Updated Oct 26, 2025
1 Updated Mar 7, 2025

A flexible and efficient training framework for large-scale alignment tasks

Python 447 39 Updated Oct 23, 2025

[ICML 2025 Oral] CodeI/O: Condensing Reasoning Patterns via Code Input-Output Prediction

Python 566 32 Updated May 6, 2025

[NeurIPS'25] Official codebase for "SWE-RL: Advancing LLM Reasoning via Reinforcement Learning on Open Software Evolution"

Python 653 56 Updated Mar 16, 2025

Ling is a MoE LLM provided and open-sourced by InclusionAI.

Python 238 20 Updated May 14, 2025

Extend OpenRLHF to support LMM RL training for reproduction of DeepSeek-R1 on multimodal tasks.

Python 836 54 Updated May 14, 2025

✨ A synthetic dataset generation framework that produces diverse coding questions and verifiable solutions - all in one framework

Python 301 18 Updated Sep 6, 2025

Latest Advances on System-2 Reasoning

Python 1,303 75 Updated Jun 8, 2025

A bidirectional pipeline parallelism algorithm for computation-communication overlap in DeepSeek V3/R1 training.

Python 2,901 312 Updated Mar 10, 2025

Lightning-Fast RL for LLM Reasoning and Agents. Made Simple & Flexible.

Python 3,378 270 Updated Jan 11, 2026

SIFT: Grounding LLM Reasoning in Contexts via Stickers

Python 57 3 Updated Mar 6, 2025

AtCoder Library

C++ 2,238 261 Updated May 1, 2025

EasyR1: An Efficient, Scalable, Multi-Modality RL Training Framework based on veRL

Python 4,407 337 Updated Jan 5, 2026

FlashMLA: Efficient Multi-head Latent Attention Kernels

C++ 11,964 926 Updated Dec 15, 2025

minimal-cost for training 0.5B R1-Zero

Python 799 102 Updated May 14, 2025

RL algorithm: Advantage induced policy alignment

Python 66 6 Updated Aug 11, 2023

Reproduce R1 Zero on Logic Puzzle

Python 2,431 165 Updated Mar 20, 2025

ReST-MCTS*: LLM Self-Training via Process Reward Guided Tree Search (NeurIPS 2024)

Python 687 51 Updated Jan 20, 2025

Democratizing Reinforcement Learning for LLMs

Python 4,963 477 Updated Jan 10, 2026

Official Repo for Open-Reasoner-Zero

Python 2,087 118 Updated Jun 2, 2025

Application Kernel for Containers

Go 17,508 1,484 Updated Jan 11, 2026

The LLVM Project is a collection of modular and reusable compiler and toolchain technologies.

LLVM 36,375 15,737 Updated Jan 11, 2026

Slurm: A Highly Scalable Workload Manager

C 3,630 788 Updated Jan 9, 2026

An open-source Java library for Constraint Programming

Java 747 152 Updated Jan 10, 2026
Python 4,281 464 Updated Jul 31, 2025