  • CUNY Grad Center
  • NYC



PyTorch-native post-training at scale

Python 515 53 Updated Nov 12, 2025

SPIRAL: Self-Play on Zero-Sum Games Incentivizes Reasoning via Multi-Agent Multi-Turn Reinforcement Learning

Python 161 17 Updated Sep 18, 2025

Tongyi Deep Research, the Leading Open-source Deep Research Agent

Python 17,106 1,301 Updated Nov 10, 2025

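🥢 Cook like Laoxiangji (老乡鸡) 🐔. The main part was completed in 2024; this is not an official Laoxiangji repository. The text comes from the "Laoxiangji Dish Traceability Report" (《老乡鸡菜品溯源报告》) and has been summarized, edited, and organized. CookLikeHOC.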

JavaScript 22,026 2,214 Updated Oct 17, 2025

Debug, evaluate, and monitor your LLM applications, RAG systems, and agentic workflows with comprehensive tracing, automated evaluations, and production-ready dashboards.

Python 15,641 1,168 Updated Nov 12, 2025

High-quality single file implementation of Deep Reinforcement Learning algorithms with research-friendly features (PPO, DQN, C51, DDPG, TD3, SAC, PPG)

Python 8,262 890 Updated Jul 8, 2025
Python 53 4 Updated May 31, 2025

Simplifying reinforcement learning for complex game environments

C 4,184 304 Updated Nov 12, 2025

Qwen3-Coder is the code version of Qwen3, the large language model series developed by Qwen team, Alibaba Cloud.

Python 14,308 991 Updated Jul 31, 2025

Lightweight coding agent that runs in your terminal

Rust 50,313 6,255 Updated Nov 12, 2025

Frequency Autoregressive Image Generation with Continuous Tokens

Python 92 4 Updated Jun 9, 2025

EasyR1: An Efficient, Scalable, Multi-Modality RL Training Framework based on veRL

Python 4,038 299 Updated Nov 3, 2025

DIAMOND (DIffusion As a Model Of eNvironment Dreams) is a reinforcement learning agent trained in a diffusion world model. NeurIPS 2024 Spotlight.

Python 1,891 134 Updated Dec 6, 2024
JavaScript 3,693 1,588 Updated Jun 21, 2024

Train your own SOTA deductive reasoning model

Python 108 8 Updated Mar 6, 2025

Visualizing the attention of vision-language models

Jupyter Notebook 252 20 Updated Feb 28, 2025

DeepEP: an efficient expert-parallel communication library

Cuda 8,715 986 Updated Nov 6, 2025

FlashMLA: Efficient Multi-head Latent Attention Kernels

C++ 11,858 899 Updated Sep 30, 2025

OpenR: An Open Source Framework for Advanced Reasoning with Large Language Models

Python 1,825 134 Updated Jan 17, 2025

Fully open reproduction of DeepSeek-R1

Python 25,633 2,398 Updated Sep 8, 2025

The official implementation of Self-Play Fine-Tuning (SPIN)

Python 1,215 102 Updated May 8, 2024

Minimal reproduction of DeepSeek R1-Zero

Python 12,389 1,521 Updated Apr 24, 2025

ccvcl website

HTML 1 Updated Oct 14, 2025

A Gaggia Classic control project using microcontrollers.

2,324 344 Updated Nov 6, 2025

Even Realities G1 smart glasses BLE control pip package

Python 72 15 Updated Nov 24, 2024

Open-source evaluation toolkit of large multi-modality models (LMMs), support 220+ LMMs, 80+ benchmarks

Python 3,354 545 Updated Nov 8, 2025

One-for-All Multimodal Evaluation Toolkit Across Text, Image, Video, and Audio Tasks

Python 3,275 423 Updated Nov 12, 2025

A dummy's guide to setting up (and using) HPC clusters on Ubuntu 22.04LTS using Slurm and Munge. Created by the Quant Club @ UIowa.

376 32 Updated Apr 3, 2024