huzican

Zican Hu huzican

A bug maker, a shit mountain builder

23 followers · 42 following

Nanjing University
China

ETO Public
Forked from Yifan-Song793/ETO

Trial and Error: Exploration-Based Trajectory Optimization of LLM Agents (ACL 2024 Main Conference)

Python Updated Jun 5, 2024
LLMBox Public
Forked from RUCAIBox/LLMBox

A comprehensive library for implementing LLMs, including a unified training pipeline and comprehensive model evaluation.

Python MIT License Updated May 27, 2024
MAmmoTH Public
Forked from TIGER-AI-Lab/MAmmoTH

Code and data for "MAmmoTH: Building Math Generalist Models through Hybrid Instruction Tuning" (ICLR 2024)

Jupyter Notebook Updated May 8, 2024
ArCHer Public
Forked from YifeiZhou02/ArCHer

Research Code for "ArCHer: Training Language Model Agents via Hierarchical Multi-Turn RL"

Python Updated Mar 30, 2024
ACORM Public
Forked from NJU-RL/ACORM

Attention-guided Contrastive Role Representations for Multi-agent Reinforcement Learning (ICLR 2024)

Python 1 Updated Mar 2, 2024
ReAct Public
Forked from ysymyth/ReAct

[ICLR 2023] ReAct: Synergizing Reasoning and Acting in Language Models

Jupyter Notebook MIT License Updated Feb 6, 2024
MetaMath Public
Forked from meta-math/MetaMath

MetaMath: Bootstrap Your Own Mathematical Questions for Large Language Models

Python Apache License 2.0 Updated Feb 1, 2024
AgentTuning Public
Forked from THUDM/AgentTuning

AgentTuning: Enabling Generalized Agent Abilities for LLMs

Python Updated Oct 31, 2023
v202 Public
Forked from mlresearch/v202

Proceedings of ICML 2023

TeX Updated Oct 27, 2023
llama-trl Public
Forked from jasonvanf/llama-trl

LLaMA-TRL: Fine-tuning LLaMA with PPO and LoRA

Python Apache License 2.0 Updated May 23, 2023
ODIS Public
Forked from LAMDA-RL/ODIS

The implementation of ICLR-2023 paper "Discovering Generalizable Multi-agent Coordination Skills from Multi-task Offline Data".

Python Apache License 2.0 Updated Mar 6, 2023
google-research Public
Forked from google-research/google-research

Google Research

Jupyter Notebook Apache License 2.0 Updated Dec 28, 2022
football Public
Forked from google-research/football

Check out the new game server:

Python Apache License 2.0 Updated Sep 25, 2022
MARL-Algorithms Public
Forked from starry-sky6688/MARL-Algorithms

Implementations of IQL, QMIX, VDN, COMA, QTRAN, MAVEN, CommNet, DyMA-CL, and G2ANet on SMAC, the decentralised micromanagement scenario of StarCraft II

Python Updated Sep 8, 2022
15th_SmartCar_SoundBeacon Public

The 15th National University Students Intelligent Car Race: Sound Beacon Group

C 11 1 Updated Mar 24, 2022
SmartCar_AI_Vision Public

The 16th National University Students Intelligent Car Race: AI Vision Group

C 3 Updated Mar 24, 2022
pytorch_rl2 Public
Forked from lucaslingle/pytorch_rl2

Implementation of 'RL^2: Fast Reinforcement Learning via Slow Reinforcement Learning'

Python Updated Jan 1, 2022
Multi-Agent-Reinforcement-Learning-Environment Public
Forked from zhuyifengzju/Multi-Agent-Reinforcement-Learning-Environment

Hello, I pushed some python environments for Multi Agent Reinforcement Learning.

Python Updated May 1, 2020
