CUNY Grad Center, NYC
Stars
SPIRAL: Self-Play on Zero-Sum Games Incentivizes Reasoning via Multi-Agent Multi-Turn Reinforcement Learning
Tongyi Deep Research, the Leading Open-source Deep Research Agent
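🥢 Cook like Laoxiangji (老乡鸡) 🐔. The main part was completed in 2024; this is not an official Laoxiangji repository. The text comes from the "Laoxiangji Dish Traceability Report" (《老乡鸡菜品溯源报告》) and has been summarized, edited, and organized. CookLikeHOC.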
Debug, evaluate, and monitor your LLM applications, RAG systems, and agentic workflows with comprehensive tracing, automated evaluations, and production-ready dashboards.
High-quality single file implementation of Deep Reinforcement Learning algorithms with research-friendly features (PPO, DQN, C51, DDPG, TD3, SAC, PPG)
Simplifying reinforcement learning for complex game environments
Qwen3-Coder is the code version of Qwen3, the large language model series developed by the Qwen team, Alibaba Cloud.
Lightweight coding agent that runs in your terminal
Frequency Autoregressive Image Generation with Continuous Tokens
EasyR1: An Efficient, Scalable, Multi-Modality RL Training Framework based on veRL
DIAMOND (DIffusion As a Model Of eNvironment Dreams) is a reinforcement learning agent trained in a diffusion world model. NeurIPS 2024 Spotlight.
Train your own SOTA deductive reasoning model
Visualizing the attention of vision-language models
DeepEP: an efficient expert-parallel communication library
FlashMLA: Efficient Multi-head Latent Attention Kernels
OpenR: An Open Source Framework for Advanced Reasoning with Large Language Models
Fully open reproduction of DeepSeek-R1
The official implementation of Self-Play Fine-Tuning (SPIN)
Minimal reproduction of DeepSeek R1-Zero
A Gaggia Classic control project using microcontrollers.
Pip package for BLE control of Even Realities G1 smart glasses
Open-source evaluation toolkit for large multi-modality models (LMMs), supporting 220+ LMMs and 80+ benchmarks
One-for-All Multimodal Evaluation Toolkit Across Text, Image, Video, and Audio Tasks
A dummy's guide to setting up (and using) HPC clusters on Ubuntu 22.04 LTS using Slurm and Munge. Created by the Quant Club @ UIowa.