Qin Hongzheng Cyan7Hz

🎯 Focusing

Popular repositories

Logic-RL Public

Forked from Unakar/Logic-RL

Reproduce R1 Zero on Logic Puzzle

Python

OpenRLHF Public

Forked from OpenRLHF/OpenRLHF

An Easy-to-use, Scalable and High-performance RLHF Framework (70B+ PPO Full Tuning & Iterative DPO & LoRA & RingAttention & RFT)

Python

RecFM Public

Forked from USTCLLM/RecFM

Comprehensive tools and frameworks for developing foundation models tailored to recommendation systems.

LLMEraser Public

Forked from oceanoceanna/LLMEraser

For reproducing

Python