akzaidi

Ali Zaidi akzaidi

https://twitter.com/alikzaidi

140 followers · 234 following

Achievements

Highlights

Pro

Starred repositories

PRIME-RL / SimpleVLA-RL

SimpleVLA-RL: Scaling VLA Training via Reinforcement Learning

Python 977 48 Updated Oct 13, 2025

NVIDIA-NeMo / RL

Scalable toolkit for efficient model reinforcement

Python 1,022 166 Updated Nov 13, 2025

TheRobotStudio / SO-ARM100

Standard Open Arm 100

4,454 373 Updated Oct 15, 2025

sethkarten / pokeagent-speedrun

Official repository of the NeurIPS 2025 Competition: The PokeAgent Challenge: Competitive and Long-Context Learning at Scale. (Track 2, Speedrunning)

Python 62 27 Updated Nov 6, 2025

steveyegge / beads

Beads - A memory upgrade for your coding agent

Go 2,726 173 Updated Nov 12, 2025

Nathan-Li123 / SMOTer

[ECCV 2024] Beyond MOT: Semantic Multi-Object Tracking

Python 53 Updated Nov 19, 2024

escontra / score_matching_rl

Code for the paper "Learning a Diffusion Model Policy from Rewards via Q-Score Matching"

Python 29 4 Updated Apr 15, 2025

PRIME-RL / RL-Compositionality

FROM $f(x)$ AND $g(x)$ TO $f(g(x))$: LLMs Learn New Skills in RL by Composing Old Ones

Python 34 3 Updated Nov 7, 2025

meta-pytorch / torchforge

PyTorch-native post-training at scale

Python 520 54 Updated Nov 13, 2025

chi-feng / mcmc-demo

Interactive Markov-chain Monte Carlo Javascript demos

JavaScript 893 124 Updated Jun 4, 2024

meta-pytorch / OpenEnv

An interface library for RL post training with environments.

Python 687 97 Updated Nov 13, 2025

microsoft / MoCapAct

A Multi-Task Dataset for Simulated Humanoid Control

Python 198 22 Updated Mar 27, 2025

psalias2006 / gpu-hot

🔥 Real-time NVIDIA GPU dashboard

JavaScript 862 44 Updated Nov 2, 2025

thinking-machines-lab / tinker-cookbook

Post-training with Tinker

Python 1,872 149 Updated Nov 12, 2025

McGill-NLP / the-markovian-thinker

Code for paper "The Markovian Thinker: Architecture-Agnostic Linear Scaling of Reasoning"

Python 310 25 Updated Nov 13, 2025

xbpeng / MimicKit

Suite of motion imitation methods for training controllers.

Python 920 91 Updated Nov 9, 2025

sachaos / viddy

👀 A modern watch command. Time machine and pager etc.

Rust 5,187 97 Updated Aug 29, 2025

snowflakedb / ArcticInference

ArcticInference: vLLM plugin for high-throughput, low-latency inference

Python 299 38 Updated Nov 11, 2025

karpathy / nanochat

The best ChatGPT that $100 can buy.

Python 36,570 4,394 Updated Nov 5, 2025

TeXlyre / texlyre

A local-first LaTeX & Typst web editor with real-time collaboration & offline support

TypeScript 478 21 Updated Nov 11, 2025

microsoft / mcp-interviewer

Catch MCP server issues before your agents do.

Python 128 15 Updated Oct 27, 2025

Keen-Technologies / physical_atari

Platform for evaluating reinforcement learning (RL) algorithms on a physical Atari system.

Python 128 2 Updated Aug 28, 2025

llm-d / llm-d

Achieve state of the art inference performance with modern accelerators on Kubernetes

Shell 2,029 230 Updated Nov 11, 2025

microsoft / Olive

Olive: Simplify ML Model Finetuning, Conversion, Quantization, and Optimization for CPUs, GPUs and NPUs.

Python 2,181 258 Updated Nov 13, 2025

microsoft / agent-framework

A framework for building, orchestrating and deploying AI agents and multi-agent workflows with support for Python and .NET.

Python 5,005 721 Updated Nov 13, 2025

zhouzypaul / wsrl

JAX implementation of WSRL and RL baselines | ICLR 2025

Python 116 13 Updated Jul 11, 2025

gradio-app / trackio

A lightweight, local-first, and 🆓 experiment tracking library from Hugging Face 🤗

Python 1,078 66 Updated Nov 7, 2025

wmn-231314 / diffusion-data-constraint

Official PyTorch implementation and models for paper "Diffusion Beats Autoregressive in Data-Constrained Settings". We find diffusion models are significantly more data-efficient than standard left…

Python 105 2 Updated Oct 27, 2025

OpenPipe / ART

Agent Reinforcement Trainer: train multi-step agents for real-world tasks using GRPO. Give your agents on-the-job training. Reinforcement learning for Qwen2.5, Qwen3, Llama, and more!

Python 7,817 602 Updated Nov 12, 2025

elder-plinius / CL4R1T4S

LEAKED SYSTEM PROMPTS FOR CHATGPT, GEMINI, GROK, CLAUDE, PERPLEXITY, CURSOR, DEVIN, REPLIT, AND MORE! - AI SYSTEMS TRANSPARENCY FOR ALL! 👐

11,807 2,379 Updated Nov 6, 2025

Starred topics

audio-source-separation

pytorch-rl

language-grounding