junwucs

Xiong Jun Wu(熊君武) junwucs

Xiong Jun Wu @ Ant Group, Beijing

54 followers · 573 following

Universe
https://junwuxgi.github.io/

Starred repositories

jovany-wang / OpenRLHF-X

Forked from OpenRLHF/OpenRLHF

An RLHF framework that enhances OpenRLHF.

Python 2 Updated Mar 19, 2025

RLHFlow / Self-rewarding-reasoning-LLM

Recipes to train the self-rewarding reasoning LLMs.

Python 228 11 Updated Mar 2, 2025

AlphaPav / mem-kk-logic

On Memorization of Large Language Models in Logical Reasoning

Python 72 6 Updated Mar 29, 2025

inclusionAI / PromptCoT

A unified suite for generating elite reasoning problems and training high-performance LLMs, including pioneering attention-free architectures

Python 130 12 Updated Oct 26, 2025

junwucs / resources

1 Updated Mar 7, 2025

alibaba / ChatLearn

A flexible and efficient training framework for large-scale alignment tasks

Python 447 39 Updated Oct 23, 2025

hkust-nlp / CodeIO

[ICML 2025 Oral] CodeI/O: Condensing Reasoning Patterns via Code Input-Output Prediction

Python 566 32 Updated May 6, 2025

facebookresearch / swe-rl

[NeurIPS'25] Official codebase for "SWE-RL: Advancing LLM Reasoning via Reinforcement Learning on Open Software Evolution"

Python 653 56 Updated Mar 16, 2025

inclusionAI / Ling

Ling is a MoE LLM provided and open-sourced by InclusionAI.

Python 238 20 Updated May 14, 2025

Zhou-Zoey / RMB-Reward-Model-Benchmark

Python 47 4 Updated Mar 25, 2025

TideDra / lmm-r1

Extend OpenRLHF to support LMM RL training for reproduction of DeepSeek-R1 on multimodal tasks.

Python 836 54 Updated May 14, 2025

KodCode-AI / kodcode

✨ A synthetic dataset generation framework that produces diverse coding questions and verifiable solutions - all in one framework

Python 301 18 Updated Sep 6, 2025

zzli2022 / Awesome-System2-Reasoning-LLM

Latest Advances on System-2 Reasoning

Python 1,303 75 Updated Jun 8, 2025

deepseek-ai / DualPipe

A bidirectional pipeline parallelism algorithm for computation-communication overlap in DeepSeek V3/R1 training.

Python 2,901 312 Updated Mar 10, 2025

inclusionAI / AReaL

Lightning-Fast RL for LLM Reasoning and Agents. Made Simple & Flexible.

Python 3,378 270 Updated Jan 11, 2026

zhijie-group / SIFT

SIFT: Grounding LLM Reasoning in Contexts via Stickers

Python 57 3 Updated Mar 6, 2025

atcoder / ac-library

AtCoder Library

C++ 2,238 261 Updated May 1, 2025

hiyouga / EasyR1

EasyR1: An Efficient, Scalable, Multi-Modality RL Training Framework based on veRL

Python 4,407 337 Updated Jan 5, 2026

deepseek-ai / FlashMLA

FlashMLA: Efficient Multi-head Latent Attention Kernels

C++ 11,964 926 Updated Dec 15, 2025

dhcode-cpp / X-R1

minimal-cost for training 0.5B R1-Zero

Python 799 102 Updated May 14, 2025

microsoft / RLHF-APA

RL algorithm: Advantage induced policy alignment

Python 66 6 Updated Aug 11, 2023

Unakar / Logic-RL

Reproduce R1 Zero on Logic Puzzle

Python 2,431 165 Updated Mar 20, 2025

THUDM / ReST-MCTS

ReST-MCTS*: LLM Self-Training via Process Reward Guided Tree Search (NeurIPS 2024)

Python 687 51 Updated Jan 20, 2025

rllm-org / rllm

Democratizing Reinforcement Learning for LLMs

Python 4,963 477 Updated Jan 10, 2026

Open-Reasoner-Zero / Open-Reasoner-Zero

Official Repo for Open-Reasoner-Zero

Python 2,087 118 Updated Jun 2, 2025

google / gvisor

Application Kernel for Containers

Go 17,508 1,484 Updated Jan 11, 2026

llvm / llvm-project

The LLVM Project is a collection of modular and reusable compiler and toolchain technologies.

LLVM 36,375 15,737 Updated Jan 11, 2026

SchedMD / slurm

Slurm: A Highly Scalable Workload Manager

C 3,630 788 Updated Jan 9, 2026

chocoteam / choco-solver

An open-source Java library for Constraint Programming

Java 747 152 Updated Jan 10, 2026

openai / simple-evals

Python 4,281 464 Updated Jul 31, 2025
