Stars
Utility to convert between various subscription formats
[ICML 2024] LESS: Selecting Influential Data for Targeted Instruction Tuning
Parse partial JSON generated by LLM
The official repo for the paper "LLMs-as-Judges: A Comprehensive Survey on LLM-based Evaluation Methods".
The official code and data for the paper "VidEgoThink: Assessing Egocentric Video Understanding Capabilities for Embodied AI"
Permanent Apple Intelligence + Xcode Predictive Code Completion for Chinese-market Mac computers
An Easy-to-use, Scalable and High-performance RLHF Framework based on Ray (PPO & GRPO & REINFORCE++ & vLLM & Ray & Dynamic Sampling & Async Agentic RL)
A collection of LLM papers, blogs, and projects, with a focus on OpenAI o1 🍓 and reasoning techniques.
A tool used by the MLNLP community to help shorten reference lists; it simplifies BibTeX entries using official information.
A simple tool to update bib entries with their official information (e.g., DBLP or the ACL anthology).
⚡FlashRAG: A Python Toolkit for Efficient RAG Research (WWW2025 Resource)
Firefly: a large language model training tool that supports training Qwen2.5, Qwen2, Yi1.5, Phi-3, Llama3, Gemma, MiniCPM, Yi, Deepseek, Orion, Xverse, Mixtral-8x7B, Zephyr, Mistral, Baichuan2, Llama2, Llama, Qwen, Baichuan, ChatGLM2, InternLM, Ziya2, Vicuna, Bloom, and other large models
Official Repo for ICLR 2024 paper MINT: Evaluating LLMs in Multi-turn Interaction with Tools and Language Feedback by Xingyao Wang*, Zihan Wang*, Jiateng Liu, Yangyi Chen, Lifan Yuan, Hao Peng and …
High-speed downloads from a mirror site using HuggingFace's official download tool.
Benchmark LLMs by fighting in Street Fighter 3! The new way to evaluate the quality of an LLM
🔥🔥🔥 [IEEE TCSVT] Latest Papers, Codes and Datasets on Vid-LLMs.
Generative Agents: Interactive Simulacra of Human Behavior
Recipes to train reward models for RLHF.
Sotopia: an Open-ended Social Learning Environment (ICLR 2024 spotlight)
A new tool learning benchmark aiming at well-balanced stability and reality, based on ToolBench.
A library for advanced large language model reasoning
AutoGPT is the vision of accessible AI for everyone, to use and to build on. Our mission is to provide the tools, so that you can focus on what matters.
Medusa: Simple Framework for Accelerating LLM Generation with Multiple Decoding Heads
ToolBench, an evaluation suite for LLM tool manipulation capabilities.