Stars
Debug, evaluate, and monitor your LLM applications, RAG systems, and agentic workflows with comprehensive tracing, automated evaluations, and production-ready dashboards.
AG-UI: the Agent-User Interaction Protocol. Bring Agents into Frontend Applications.
Learn to build an AI PR reviewer from scratch, integrated with GitHub, Slack, and Asana, that scales within your organization.
This repository delivers end-to-end, code-first tutorials covering every layer of production-grade GenAI agents, guiding you from spark to scale with proven patterns and reusable blueprints for re…
Collection of awesome LLM apps with AI Agents and RAG using OpenAI, Anthropic, Gemini, and open-source models.
This repository contains the Hugging Face Agents Course.
This is the homepage of a new book entitled "Mathematical Foundations of Reinforcement Learning."
List of papers on hallucination detection in LLMs.
SelfCheckGPT: Zero-Resource Black-Box Hallucination Detection for Generative Large Language Models
Sharing both practical insights and theoretical knowledge about LLM evaluation that we gathered while managing the Open LLM Leaderboard and designing lighteval!
A practical guide to LLMs: from the fundamentals to deploying advanced LLM and RAG apps to AWS using LLMOps best practices
Course to get into Large Language Models (LLMs) with roadmaps and Colab notebooks.
Evals is a framework for evaluating LLMs and LLM systems, and an open-source registry of benchmarks.
Fine-tuning & Reinforcement Learning for LLMs. 🦥 Train OpenAI gpt-oss, DeepSeek, Qwen, Llama, Gemma, TTS 2x faster with 70% less VRAM.
LLM Zoomcamp - a free online course about real-life applications of LLMs. In 10 weeks you will learn how to build an AI system that answers questions about your knowledge base.
This repository is a curated collection of the most exciting and influential CVPR 2024 papers. 🔥 [Paper + Code + Demo]
📚 List of awesome university courses for learning Computer Science!
Interactive roadmaps, guides and other educational content to help developers grow in their careers.
Supercharge Your LLM Application Evaluations 🚀
Python SDK, Proxy Server (AI Gateway) to call 100+ LLM APIs in OpenAI (or native) format, with cost tracking, guardrails, load balancing, and logging. [Bedrock, Azure, OpenAI, VertexAI, Cohere, Anthr…
Kickstart your MLOps initiative with a flexible, robust, and productive Python package.
This repo lists relevant papers summarized in our survey paper: A Systematic Survey of Prompt Engineering on Vision-Language Foundation Models.
A reactive notebook for Python — run reproducible experiments, query with SQL, execute as a script, deploy as an app, and version with git. Stored as pure Python. All in a modern, AI-native editor.