Starred repositories
jovany-wang / OpenRLHF-X
Forked from OpenRLHF/OpenRLHF
An RLHF framework that enhances OpenRLHF.
Recipes to train the self-rewarding reasoning LLMs.
On Memorization of Large Language Models in Logical Reasoning
A unified suite for generating elite reasoning problems and training high-performance LLMs, including pioneering attention-free architectures
A flexible and efficient training framework for large-scale alignment tasks
[ICML 2025 Oral] CodeI/O: Condensing Reasoning Patterns via Code Input-Output Prediction
[NeurIPS'25] Official codebase for "SWE-RL: Advancing LLM Reasoning via Reinforcement Learning on Open Software Evolution"
Ling is a MoE LLM provided and open-sourced by InclusionAI.
Extend OpenRLHF to support LMM RL training for reproduction of DeepSeek-R1 on multimodal tasks.
✨ A synthetic dataset generation framework that produces diverse coding questions and verifiable solutions - all in one framework
Latest Advances on System-2 Reasoning
A bidirectional pipeline parallelism algorithm for computation-communication overlap in DeepSeek V3/R1 training.
Lightning-Fast RL for LLM Reasoning and Agents. Made Simple & Flexible.
SIFT: Grounding LLM Reasoning in Contexts via Stickers
EasyR1: An Efficient, Scalable, Multi-Modality RL Training Framework based on veRL
FlashMLA: Efficient Multi-head Latent Attention Kernels
RL algorithm: Advantage induced policy alignment
ReST-MCTS*: LLM Self-Training via Process Reward Guided Tree Search (NeurIPS 2024)
Official Repo for Open-Reasoner-Zero
The LLVM Project is a collection of modular and reusable compiler and toolchain technologies.
An open-source Java library for Constraint Programming