Stars
Shopping MMLU: A Multi-Task Online Shopping Benchmark for LLMs.
RAG-Anything: All-in-One RAG Framework
🎒 Token-Oriented Object Notation (TOON) – A compact, deterministic JSON format for LLM prompts. Spec, benchmarks, TypeScript SDK.
E2Rank: Your Text Embedding can Also be an Effective and Efficient Listwise Reranker
Detection and automatic updating of Korean datasets uploaded to Hugging Face
nanorlhf: a from-scratch journey into how LLMs and RLHF really work.
The absolute trainer to light up AI agents.
Paper2Code: Automating Code Generation from Scientific Papers in Machine Learning
WorkArena: How Capable are Web Agents at Solving Common Knowledge Work Tasks?
Tongyi Deep Research, the Leading Open-source Deep Research Agent
AgentLab: An open-source framework for developing, testing, and benchmarking web agents on diverse tasks, designed for scalability and reproducibility.
Make beautiful isometric infrastructure diagrams
A simple JSON parser specifically designed to handle malformed JSON output from Large Language Models (LLMs) like GPT, Claude, and others.
In-depth tutorials on LLMs, RAGs and real-world AI agent applications.
This open-source curriculum introduces the fundamentals of Model Context Protocol (MCP) through real-world, cross-language examples in .NET, Java, TypeScript, JavaScript, Rust and Python. Designed …
Convert any PDF into a podcast episode!
LangGraph ReAct agents that can use MCP collaboration tools dynamically