View zhentingqi's full-sized avatar


Tongyi Deep Research, the Leading Open-source Deep Research Agent

Python 17,254 1,315 Updated Nov 19, 2025

Post-training with Tinker

Python 2,127 171 Updated Nov 18, 2025
Python 15 3 Updated Jun 23, 2025
Python 11 1 Updated Jun 5, 2025

Tensors and Dynamic neural networks in Python with strong GPU acceleration

Python 95,220 25,954 Updated Nov 20, 2025

Latest Advances on Embodied Multimodal LLMs (or Vision-Language-Action Models).

121 6 Updated Jul 4, 2024
Python 3 Updated Jun 8, 2025

[NeurIPS'25] Official codebase for "SWE-RL: Advancing LLM Reasoning via Reinforcement Learning on Open Software Evolution"

Python 620 51 Updated Mar 16, 2025

[ICML 2025] Satori: Reinforcement Learning with Chain-of-Action-Thought Enhances LLM Reasoning via Autoregressive Search

Python 108 6 Updated Jun 3, 2025

Official Repo for Open-Reasoner-Zero

Python 2,065 119 Updated Jun 2, 2025

AllenAI's post-training codebase

Python 3,301 458 Updated Nov 20, 2025

verl: Volcano Engine Reinforcement Learning for LLMs

Python 16,144 2,597 Updated Nov 19, 2025

Agentless🐱: an agentless approach to automatically solve software development problems

Python 1,965 213 Updated Dec 22, 2024

DSPy: The framework for programming—not prompting—language models

Python 30,124 2,416 Updated Nov 18, 2025

A lightweight reproduction of DeepSeek-R1-Zero with in-depth analysis of self-reflection behavior.

Python 248 10 Updated Apr 15, 2025

Minimalistic large language model 3D-parallelism training

Python 2,326 257 Updated Sep 3, 2025

[NeurIPS 2024] OSWorld: Benchmarking Multimodal Agents for Open-Ended Tasks in Real Computer Environments

Python 2,321 331 Updated Nov 19, 2025

🙌 OpenHands: Code Less, Make More

Python 65,107 7,934 Updated Nov 20, 2025

🌾 OAT: A research-friendly framework for LLM online alignment, including reinforcement learning, preference learning, etc.

Python 568 47 Updated Oct 31, 2025

Lighteval is your all-in-one toolkit for evaluating LLMs across multiple backends

Python 2,125 384 Updated Nov 18, 2025

Fully open reproduction of DeepSeek-R1

Python 25,656 2,404 Updated Sep 8, 2025

An Easy-to-use, Scalable and High-performance RLHF Framework based on Ray (PPO & GRPO & REINFORCE++ & vLLM & Ray & Dynamic Sampling & Async Agentic RL)

Python 8,423 816 Updated Nov 9, 2025

Data and Code for CVPR 2025 paper "MMVU: Measuring Expert-Level Multi-Discipline Video Understanding"

Python 75 1 Updated Feb 28, 2025

CRUXEval: Code Reasoning, Understanding, and Execution Evaluation

Python 158 26 Updated Oct 11, 2024

A generative world for general-purpose robotics & embodied AI learning.

Python 27,652 2,549 Updated Nov 19, 2025

Ray is an AI compute engine. Ray consists of a core distributed runtime and a set of AI Libraries for accelerating ML workloads.

Python 39,918 6,920 Updated Nov 20, 2025

[ICML'24] Magicoder: Empowering Code Generation with OSS-Instruct

Python 2,059 170 Updated Nov 1, 2024

Entropy Based Sampling and Parallel CoT Decoding

Python 3,424 324 Updated Nov 13, 2024