-
ETO Public
Forked from Yifan-Song793/ETO
Trial and Error: Exploration-Based Trajectory Optimization of LLM Agents (ACL 2024 Main Conference)
Python Updated Jun 5, 2024 -
LLMBox Public
Forked from RUCAIBox/LLMBox
A comprehensive library for implementing LLMs, including a unified training pipeline and comprehensive model evaluation.
Python MIT License Updated May 27, 2024 -
MAmmoTH Public
Forked from TIGER-AI-Lab/MAmmoTH
Code and data for "MAmmoTH: Building Math Generalist Models through Hybrid Instruction Tuning" (ICLR 2024)
Jupyter Notebook Updated May 8, 2024 -
ArCHer Public
Forked from YifeiZhou02/ArCHer
Research Code for "ArCHer: Training Language Model Agents via Hierarchical Multi-Turn RL"
Python Updated Mar 30, 2024 -
ACORM Public
Forked from NJU-RL/ACORM
Attention-guided Contrastive Role Representations for Multi-agent Reinforcement Learning (ICLR 2024)
-
ReAct Public
Forked from ysymyth/ReAct
[ICLR 2023] ReAct: Synergizing Reasoning and Acting in Language Models
Jupyter Notebook MIT License Updated Feb 6, 2024 -
MetaMath Public
Forked from meta-math/MetaMath
MetaMath: Bootstrap Your Own Mathematical Questions for Large Language Models
Python Apache License 2.0 Updated Feb 1, 2024 -
AgentTuning Public
Forked from THUDM/AgentTuning
AgentTuning: Enabling Generalized Agent Abilities for LLMs
Python Updated Oct 31, 2023 -
llama-trl Public
Forked from jasonvanf/llama-trl
LLaMA-TRL: Fine-tuning LLaMA with PPO and LoRA
Python Apache License 2.0 Updated May 23, 2023 -
ODIS Public
Forked from LAMDA-RL/ODIS
The implementation of ICLR-2023 paper "Discovering Generalizable Multi-agent Coordination Skills from Multi-task Offline Data".
Python Apache License 2.0 Updated Mar 6, 2023 -
google-research Public
Forked from google-research/google-research
Google Research
Jupyter Notebook Apache License 2.0 Updated Dec 28, 2022 -
football Public
Forked from google-research/football
Check out the new game server:
Python Apache License 2.0 Updated Sep 25, 2022 -
MARL-Algorithms Public
Forked from starry-sky6688/MARL-Algorithms
Implementations of IQL, QMIX, VDN, COMA, QTRAN, MAVEN, CommNet, DyMA-CL, and G2ANet on SMAC, the decentralised micromanagement scenario of StarCraft II
Python Updated Sep 8, 2022 -
pytorch_rl2 Public
Forked from lucaslingle/pytorch_rl2
Implementation of 'RL^2: Fast Reinforcement Learning via Slow Reinforcement Learning'
Python Updated Jan 1, 2022 -
Multi-Agent-Reinforcement-Learning-Environment Public
Forked from zhuyifengzju/Multi-Agent-Reinforcement-Learning-Environment
Hello, I pushed some python environments for Multi Agent Reinforcement Learning.
Python Updated May 1, 2020