University of Chinese Academy of Sciences
China
(UTC +08:00)
https://www.ucas.edu.cn/
Stars
Recovery-Bench is a benchmark for evaluating the capability of LLM agents to recover from mistakes
Letta is the platform for building stateful agents: open AI with advanced memory that can learn and self-improve over time.
📑 PageIndex: Document Index for Reasoning-based RAG
[COLM 2024] LoraHub: Efficient Cross-Task Generalization via Dynamic LoRA Composition
Tongyi Deep Research, the Leading Open-source Deep Research Agent
A collection of notebooks/recipes showcasing some fun and effective ways of using Claude.
DeerFlow is a community-driven Deep Research framework, combining language models with tools like web search, crawling, and Python execution, while contributing back to the open-source community.
WideSearch: Benchmarking Agentic Broad Info-Seeking
Trae Agent is an LLM-based agent for general purpose software engineering tasks.
The absolute trainer to light up AI agents.
slime is an LLM post-training framework for RL Scaling.
Claude Code is an agentic coding tool that lives in your terminal, understands your codebase, and helps you code faster by executing routine tasks, explaining complex code, and handling git workflows.
gpt-oss-120b and gpt-oss-20b are two open-weight language models by OpenAI
An open-source, code-first Python toolkit for building, evaluating, and deploying sophisticated AI agents with flexibility and control.
Code for ICLR 2025 Paper "What is Wrong with Perplexity for Long-context Language Modeling?"
[TMLR 2025] Stop Overthinking: A Survey on Efficient Reasoning for Large Language Models
Experimental playground for benchmarking language model (LM) architectures, layers, and tricks on smaller datasets. Designed for flexible experimentation and exploration.
Gemini is a modern LaTeX beamerposter theme 🖼
Tools for merging pretrained large language models.
MiniMax-M1, the world's first open-weight, large-scale hybrid-attention reasoning model.
🚀 Efficient implementations of state-of-the-art linear attention models
Repo for "LoLCATs: On Low-Rank Linearizing of Large Language Models"
M1: Towards Scalable Test-Time Compute with Mamba Reasoning Models
verl: Volcano Engine Reinforcement Learning for LLMs
[NeurIPS'24 Spotlight, ICLR'25, ICML'25] To speed up long-context LLM inference, approximates the attention with dynamic sparse computation, reducing inference latency by up to 10x for pre-filling while maintaining accuracy.
Code for paper: [ICLR2025 Oral] FlexPrefill: A Context-Aware Sparse Attention Mechanism for Efficient Long-Sequence Inference