Stars
TreeRL: LLM Reinforcement Learning with On-Policy Tree Search in ACL'25
A bidirectional pipeline parallelism algorithm for computation-communication overlap in DeepSeek V3/R1 training.
[COLING 2025] Automated Molecular Concept Generation and Labeling with Large Language Models
Official codebase for the Scattered Forest Search: Smarter Code Space Exploration and Inference Scaling with LLMs
DataSciBench: An LLM Agent Benchmark for Data Science
The website of paper "Strategist: Learning Strategic Skills by LLMs via Bi-Level Tree Search"
Repository for Data Distillation for Offline Reinforcement Learning
Sci-BeRT model for paper reference source tracing. Submission for 2024 PST-KDD Cup.
Code and Data for "MIRAI: Evaluating LLM Agents for Event Forecasting"
Course project for CS 145 - KDD 2024 AQA Challenge
ReST-MCTS*: LLM Self-Training via Process Reward Guided Tree Search (NeurIPS 2024)
The official repo of paper "Self-Control of LLM Behaviors by Compressing Suffix Gradient into Prefix Controller"
Enhancing Large Vision Language Models with Self-Training on Image Comprehension.
Path-RAG: Knowledge-Guided Key Region Retrieval for Open-ended Pathology Visual Question Answering
🤝 The code for "Can Large Language Model Agents Simulate Human Trust Behaviors?"
HLSyn benchmark for paper "Towards a Comprehensive Benchmark for FPGA Targeted High-Level Synthesis"
Official code for Symbolic Music Generation with Non-Differentiable Rule Guided Diffusion (ICML 2024, Oral).
The official implementation of Self-Play Fine-Tuning (SPIN)
Reference implementation for DPO (Direct Preference Optimization)
SciGLM: Training Scientific Language Models with Self-Reflective Instruction Annotation and Tuning (NeurIPS D&B Track 2024)
UCLA-DM / HLSyn
Forked from ZongyueQin/HLSyn
HLSyn benchmark for paper "Towards a Comprehensive Benchmark for FPGA Targeted High-Level Synthesis"
A Comprehensive Benchmark to Evaluate LLMs as Agents (ICLR'24)
This repository contains an LLM benchmark for the social deduction game "Resistance Avalon"
Learning to Group Auxiliary Datasets for Molecule (NeurIPS 2023)