Stars
Plan-R1: Safe and Feasible Trajectory Planning as Language Modeling
A tiny deep learning training framework implemented from scratch in C++ that follows PyTorch's API.
Reinforcement Learning Tutorial with Demo: DP (Policy and Value Iteration), Monte Carlo, TD Learning (SARSA, QLearning), Function Approximation, Policy Gradient, DQN, Imitation, Meta Learning, Pape…
Code for the paper "Language Models are Unsupervised Multitask Learners"
Wife approved HomeOps driven by Kubernetes and GitOps using Flux
My GitOps-managed home Kubernetes cluster... and more! ⛵
GRPO training code which scales to 32xH100s for long horizon terminal/coding tasks. Base agent is now the top Qwen3 agent on Stanford's TerminalBench leaderboard.
A benchmark for LLMs on complicated tasks in the terminal
Lightweight coding agent that runs in your terminal
A C library for creating Excel XLSX files.
💫 Toolkit to help you get started with Spec-Driven Development
antgroup / ant-ray
Forked from ray-project/ray
Ray is an AI compute engine. Ray consists of a core distributed runtime and a set of AI Libraries for accelerating ML workloads. AntRay is forked from ray, offering incremental new features on top …
slime is an LLM post-training framework for RL Scaling.
Use PEFT or Full-parameter to CPT/SFT/DPO/GRPO 500+ LLMs (Qwen3, Qwen3-MoE, Llama4, GLM4.5, InternLM3, DeepSeek-R1, ...) and 200+ MLLMs (Qwen3-VL, Qwen3-Omni, InternVL3.5, Ovis2.5, Llava, GLM4v, Ph…
Trae Agent is an LLM-based agent for general purpose software engineering tasks.
Spark RAPIDS plugin - accelerate Apache Spark with GPUs
AGENTS.md — a simple, open format for guiding coding agents
Open source software for autonomous drones.
This is the homepage of a new book entitled "Mathematical Foundations of Reinforcement Learning."
1 Line of code data quality profiling & exploratory data analysis for Pandas and Spark DataFrames.
TARE Exploration Planner for Ground Vehicles
An elegant PyTorch deep reinforcement learning library.
C++-based high-performance parallel environment execution engine (vectorized env) for general RL environments.