Use PEFT or Full-parameter to CPT/SFT/DPO/GRPO 500+ LLMs (Qwen3, Qwen3-MoE, Llama4, GLM4.5, InternLM3, DeepSeek-R1, ...) and 200+ MLLMs (Qwen3-VL, Qwen3-Omni, InternVL3.5, Ovis2.5, Llava, GLM4v, Ph…

Python 11,325 1,011 Updated Nov 28, 2025

gpu-mode / lectures

Material for gpu-mode lectures

Jupyter Notebook 5,358 538 Updated Nov 21, 2025

volcengine / verl

verl: Volcano Engine Reinforcement Learning for LLMs

Python 16,819 2,677 Updated Nov 27, 2025

matsuolab / llm_bridge_prod

Python 32 14 Updated Aug 21, 2025

huggingface / Math-Verify

Python 1,014 48 Updated Jul 2, 2025

Damin3927 / llm2025compet

Team Neko (the preliminary)

Python 7 Updated Oct 23, 2025

kaityo256 / sevendayshpc

Become a supercomputer programmer in one week!

HTML 723 29 Updated Apr 10, 2025

huggingface / open-r1

Fully open reproduction of DeepSeek-R1

Python 25,693 2,402 Updated Nov 24, 2025

allenai / IFBench

Python 92 12 Updated Nov 27, 2025

AmeerArsala / LLM-Data-Cleaner

Data Cleaning using LLMs

Jupyter Notebook 8 Updated Mar 17, 2024

huggingface / trl

Train transformer language models with reinforcement learning.

Python 16,456 2,322 Updated Nov 28, 2025

huggingface / diffusers

🤗 Diffusers: State-of-the-art diffusion models for image, video, and audio generation in PyTorch.

Python 31,799 6,549 Updated Nov 28, 2025

pallets / jinja

A very fast and expressive template engine.

Python 11,302 1,684 Updated Jun 14, 2025

unslothai / unsloth

Fine-tuning & Reinforcement Learning for LLMs. 🦥 Train OpenAI gpt-oss, DeepSeek-R1, Qwen3, Gemma 3, TTS 2x faster with 70% less VRAM.

Python 48,781 4,015 Updated Nov 28, 2025

hamishivi / EasyLM

Forked from young-geng/EasyLM

Large language models (LLMs) made easy. EasyLM is a one-stop solution for pre-training, fine-tuning, evaluating, and serving LLMs in JAX/Flax.

Python 76 16 Updated Aug 17, 2024

ShotaKaji5207

Stars

rlresearch / dr-tulu

thinking-machines-lab / tinker-cookbook

rllm-org / rllm

ChenxinAn-fdu / POLARIS

rasbt / LLMs-from-scratch

simplescaling / s1

ServiceNow / PipelineRL

PrimeIntellect-ai / verifiers

nishadsinghi / sc-genrm-scaling

alibaba / ROLL

PRIME-RL / PRIME

McGill-NLP / nano-aha-moment

modelscope / ms-swift