🐕
Working on something

Organizations

@buaase @sail-sg @MLNLP-World @sea-sailor

Python 69 3 Updated Nov 17, 2025

verl-agent is an extension of veRL, designed for training LLM/VLM agents via RL. verl-agent is also the official code for the paper "Group-in-Group Policy Optimization for LLM Agent Training"

Python 1,227 107 Updated Oct 20, 2025

SGLang is a fast serving framework for large language models and vision language models.

Python 20,496 3,558 Updated Nov 29, 2025

Laser: Learn to Reason Efficiently with Adaptive Length-based Reward Shaping

Python 60 4 Updated May 22, 2025

Defeating the Training-Inference Mismatch via FP16

Python 159 13 Updated Nov 14, 2025

slime is an LLM post-training framework for RL Scaling.

Python 2,615 291 Updated Nov 29, 2025

The Tool Decathlon: Benchmarking Language Agents for Diverse, Realistic, and Long-Horizon Task Execution

Python 145 10 Updated Nov 28, 2025

NOFX: Defining the Next-Generation AI Trading Operating System. A multi-exchange AI trading platform (Binance/Hyperliquid/Aster) with multi-AI competition (deepseek/qwen/gemini/claude) self-evolution,…

Go 8,201 2,158 Updated Nov 28, 2025

MiniMax-M2, a model built for Max coding & agentic workflows.

1,903 146 Updated Nov 13, 2025

BigCodeArena: Unveiling More Reliable Human Preferences in Code Generation via Execution

Python 54 3 Updated Oct 13, 2025

Post-training with Tinker

Python 2,245 190 Updated Nov 25, 2025

The official GitHub repo for "Training Optimal Large Diffusion Language Models", the first-ever large-scale diffusion language model scaling law.

Python 42 Updated Nov 6, 2025

All-in-One Sandbox for AI Agents that combines Browser, Shell, File, MCP and VSCode Server in a single Docker container.

Python 1,525 138 Updated Nov 28, 2025

Checkpoint-engine is a simple middleware to update model weights in LLM inference engines

Python 849 68 Updated Nov 24, 2025

Official Code for "Mini-o3: Scaling Up Reasoning Patterns and Interaction Turns for Visual Search"

Python 370 15 Updated Sep 15, 2025
Python 50 5 Updated Jun 7, 2025

End-to-End Reinforcement Learning for Multi-Turn Tool-Integrated Reasoning

Python 330 18 Updated Sep 22, 2025

[NeurIPS 2025] The official repo of SynLogic: Synthesizing Verifiable Reasoning Data at Scale for Learning Logical Reasoning and Beyond

Python 187 21 Updated Jul 7, 2025
Python 48 7 Updated Aug 21, 2025

GLM-4.5V and GLM-4.1V-Thinking: Towards Versatile Multimodal Reasoning with Scalable Reinforcement Learning

Python 1,758 102 Updated Oct 28, 2025

SWE-Swiss: A Multi-Task Fine-Tuning and RL Recipe for High-Performance Issue Resolution

Python 98 5 Updated Sep 24, 2025

gpt-oss-120b and gpt-oss-20b are two open-weight language models by OpenAI

Python 19,316 1,947 Updated Nov 1, 2025

Renderer for the harmony response format to be used with gpt-oss

Rust 4,038 233 Updated Nov 5, 2025

The 100-line AI agent that solves GitHub issues or helps you in your command line. Radically simple, no huge configs, no giant monorepo, but scores >74% on SWE-bench Verified!

Python 2,162 258 Updated Nov 26, 2025

SkyRL: A Modular Full-stack RL Library for LLMs

Python 1,282 187 Updated Nov 27, 2025

The absolute trainer to light up AI agents.

Python 9,026 723 Updated Nov 29, 2025

Qwen Code is a coding agent that lives in the digital world.

TypeScript 15,925 1,334 Updated Nov 29, 2025
Python 43 7 Updated Oct 28, 2025

Kimi K2 is the large language model series developed by Moonshot AI team

9,612 684 Updated Nov 7, 2025