Starred repositories
slime is an LLM post-training framework for RL Scaling.
The official repository of The Road Less Traveled: Enhancing Exploration in LLMs via Sequential Sampling.
[ICLR 2025 Oral🔥] SD-LoRA: Scalable Decoupled Low-Rank Adaptation for Class Incremental Learning
Production-ready platform for agentic workflow development.
SWE-agent takes a GitHub issue and tries to automatically fix it, using your LM of choice. It can also be employed for offensive cybersecurity or competitive coding challenges. [NeurIPS 2024]
JARVIS, a system to connect LLMs with the ML community. Paper: https://arxiv.org/pdf/2303.17580.pdf
[ICLR 2023] ReAct: Synergizing Reasoning and Acting in Language Models
Train transformer language models with reinforcement learning.
Course to get into Large Language Models (LLMs) with roadmaps and Colab notebooks.
A Datacenter Scale Distributed Inference Serving Framework
🌟 The Multi-Agent Framework: First AI Software Company, Towards Natural Language Programming
Build resilient language agents as graphs.
🤖 Assemble, configure, and deploy autonomous AI Agents in your browser.
🌐 Make websites accessible for AI agents. Automate tasks online with ease.
A live-stream development of RL tuning for LLM agents
No fortress, purely open ground. OpenManus is Coming.
AutoGPT is the vision of accessible AI for everyone, to use and to build on. Our mission is to provide the tools, so that you can focus on what matters.
Fine-tuning & Reinforcement Learning for LLMs. 🦥 Train OpenAI gpt-oss, DeepSeek-R1, Qwen3, Gemma 3, TTS 2x faster with 70% less VRAM.
Autonomously train research-agent LLMs on custom data using reinforcement learning and self-verification.
Model Context Protocol Servers
A high-throughput and memory-efficient inference and serving engine for LLMs
A lightweight, powerful framework for multi-agent workflows
Unified Efficient Fine-Tuning of 100+ LLMs & VLMs (ACL 2024)
DeepSpeed is a deep learning optimization library that makes distributed training and inference easy, efficient, and effective.
My learning notes/codes for ML SYS.