Stars
Convert Word documents to beautiful Markdown. Via command line or in your browser.
HuggingFace conversion and training library for Megatron-based models
An open-source RAG-based tool for chatting with your documents.
[NeurIPS 2025 Spotlight] Reasoning Environments for Reinforcement Learning with Verifiable Rewards
Atropos is a Language Model Reinforcement Learning Environments framework for collecting and evaluating LLM trajectories through diverse environments
bespokelabsai / verifiers
Forked from PrimeIntellect-ai/verifiers
Verifiers for LLM Reinforcement Learning
Automatic, unsupervised collection of web agent training data via exploration.
MCP-based Agent Deep Evaluation System
τ²-Bench: Evaluating Conversational Agents in a Dual-Control Environment
A very quick project that transforms research papers into engaging three-person discussions, offering an intuitive and thought-provoking listening experience. Perfect for podcast enthusiasts seekin…
Open-Source Chrome extension for AI-powered web automation. Run multi-agent workflows using your own LLM API key. Alternative to OpenAI Operator.
MCPCorpus is a comprehensive dataset for analyzing the Model Context Protocol (MCP) ecosystem, containing ~14K MCP servers and 300 MCP clients with 20+ normalized metadata attributes.
Allow LLMs to control a browser with Browserbase and Stagehand
Repository of the paper "PlagBench: Exploring the Duality of Large Language Models in Plagiarism Generation and Detection", NAACL 2025.
An MCP (Model Context Protocol) server for PowerPoint manipulation using python-pptx. This server provides tools for creating, editing, and manipulating PowerPoint presentations through the MCP prot…
OmniGen2: Exploration to Advanced Multimodal Generation. https://arxiv.org/abs/2506.18871
Context7 MCP Server -- Up-to-date code documentation for LLMs and AI code editors
An Autonomous Agentic Framework for Reflective PowerPoint Generation
DeepResearch Bench: A Comprehensive Benchmark for Deep Research Agents
Agentless🐱: an agentless approach to automatically solve software development problems
Code for "[COLM'25] RepoST: Scalable Repository-Level Coding Environment Construction with Sandbox Testing"
Open-source coding LLM for software engineering tasks
SWE-agent takes a GitHub issue and tries to automatically fix it, using your LM of choice. It can also be employed for offensive cybersecurity or competitive coding challenges. [NeurIPS 2024]
AI Agent that handles engineering tasks end-to-end: integrates with developers’ tools, plans, executes, and iterates until it achieves a successful result.