Qian SivilTaram

🐕

Working on something

Researcher @ TikTok

667 followers · 182 following

Singapore
http://siviltaram.github.io/
@sivil_taram

Achievements

x3 x2

Lists (1)

✨ Inspiration

1 repository

Stars

deepseek-ai / DeepSeek-Math-V2

986 49 Updated Nov 27, 2025

arcee-ai / pybubble

Python 69 3 Updated Nov 17, 2025

langfengQ / verl-agent

verl-agent is an extension of veRL, designed for training LLM/VLM agents via RL. verl-agent is also the official code for paper "Group-in-Group Policy Optimization for LLM Agent Training"

Python 1,227 107 Updated Oct 20, 2025

sgl-project / sglang

SGLang is a fast serving framework for large language models and vision language models.

Python 20,496 3,558 Updated Nov 29, 2025

hkust-nlp / Laser

Laser: Learn to Reason Efficiently with Adaptive Length-based Reward Shaping

Python 60 4 Updated May 22, 2025

sail-sg / Precision-RL

Defeating the Training-Inference Mismatch via FP16

Python 159 13 Updated Nov 14, 2025

THUDM / slime

slime is an LLM post-training framework for RL Scaling.

Python 2,615 291 Updated Nov 29, 2025

hkust-nlp / Toolathlon

The Tool Decathlon: Benchmarking Language Agents for Diverse, Realistic, and Long-Horizon Task Execution

Python 145 10 Updated Nov 28, 2025

NoFxAiOS / nofx

NOFX: Defining the Next-Generation AI Trading Operating System. A multi-exchange AI trading platform (Binance/Hyperliquid/Aster) with multi-AI competition (deepseek/qwen/gemini/claude) self-evolution,…

Go 8,201 2,158 Updated Nov 28, 2025

MiniMax-AI / MiniMax-M2

MiniMax-M2, a model built for Max coding & agentic workflows.

1,903 146 Updated Nov 13, 2025

bigcode-project / bigcodearena

BigCodeArena: Unveiling More Reliable Human Preferences in Code Generation via Execution

Python 54 3 Updated Oct 13, 2025

thinking-machines-lab / tinker-cookbook

Post-training with Tinker

Python 2,245 190 Updated Nov 25, 2025

JinjieNi / Quokka

The official GitHub repo for "Training Optimal Large Diffusion Language Models", the first-ever large-scale scaling law for diffusion language models.

Python 42 Updated Nov 6, 2025

agent-infra / sandbox

All-in-One Sandbox for AI Agents that combines Browser, Shell, File, MCP and VSCode Server in a single Docker container.

Python 1,525 138 Updated Nov 28, 2025

MoonshotAI / checkpoint-engine

Checkpoint-engine is a simple middleware to update model weights in LLM inference engines

Python 849 68 Updated Nov 24, 2025

Mini-o3 / Mini-o3

Official Code for "Mini-o3: Scaling Up Reasoning Patterns and Interaction Turns for Visual Search"

Python 370 15 Updated Sep 15, 2025

jdf-prog / LLM-Engines

Python 50 5 Updated Jun 7, 2025

ltzheng / SimpleTIR

End-to-End Reinforcement Learning for Multi-Turn Tool-Integrated Reasoning

Python 330 18 Updated Sep 22, 2025

MiniMax-AI / SynLogic

[NeurIPS 2025] The official repo of SynLogic: Synthesizing Verifiable Reasoning Data at Scale for Learning Logical Reasoning and Beyond

Python 187 21 Updated Jul 7, 2025

richardodliu / OpenCodeEval

Python 48 7 Updated Aug 21, 2025

zai-org / GLM-V

GLM-4.5V and GLM-4.1V-Thinking: Towards Versatile Multimodal Reasoning with Scalable Reinforcement Learning

Python 1,758 102 Updated Oct 28, 2025

zhenyuhe00 / SWE-Swiss

SWE-Swiss: A Multi-Task Fine-Tuning and RL Recipe for High-Performance Issue Resolution

Python 98 5 Updated Sep 24, 2025

openai / gpt-oss

gpt-oss-120b and gpt-oss-20b are two open-weight language models by OpenAI

Python 19,316 1,947 Updated Nov 1, 2025

openai / harmony

Renderer for the harmony response format to be used with gpt-oss

Rust 4,038 233 Updated Nov 5, 2025

SWE-agent / mini-swe-agent

The 100 line AI agent that solves GitHub issues or helps you in your command line. Radically simple, no huge configs, no giant monorepo—but scores >74% on SWE-bench verified!

Python 2,162 258 Updated Nov 26, 2025

NovaSky-AI / SkyRL

SkyRL: A Modular Full-stack RL Library for LLMs

Python 1,282 187 Updated Nov 27, 2025

microsoft / agent-lightning

The absolute trainer to light up AI agents.

Python 9,026 723 Updated Nov 29, 2025

QwenLM / qwen-code

Qwen Code is a coding agent that lives in the digital world.

TypeScript 15,925 1,334 Updated Nov 29, 2025

SWE-Perf / SWE-Perf

Python 43 7 Updated Oct 28, 2025

MoonshotAI / Kimi-K2

Kimi K2 is the large language model series developed by Moonshot AI team

9,612 684 Updated Nov 7, 2025