View Zefan-Cai's full-sized avatar


Timestep Embedding Tells: It's Time to Cache for Video Diffusion Model

Python 1,149 45 Updated Jun 8, 2025

Official implementation of "Fast-dLLM: Training-free Acceleration of Diffusion LLM by Enabling KV Cache and Parallel Decoding"

Python 618 52 Updated Oct 23, 2025

🙌 OpenHands: Code Less, Make More

Python 64,596 7,842 Updated Oct 31, 2025

The 100 line AI agent that solves GitHub issues or helps you in your command line. Radically simple, no huge configs, no giant monorepo—but scores >70% on SWE-bench verified!

Python 1,964 213 Updated Oct 27, 2025

Code for Paper: Training Software Engineering Agents and Verifiers with SWE-Gym [ICML 2025]

Jupyter Notebook 562 34 Updated Jul 29, 2025

Contexts Optical Compression

Python 18,834 1,270 Updated Oct 25, 2025

Code release for paper "Test-Time Training Done Right"

Python 306 14 Updated Sep 8, 2025

🔥 LLM-powered GPU kernel synthesis: Train models to convert PyTorch ops into optimized Triton kernels via SFT+RL. Multi-turn compilation feedback, cross-platform NVIDIA/AMD, Kernelbook + KernelBench

Python 91 2 Updated Oct 9, 2025

The official implementation for [NeurIPS2025 Oral] Gated Attention for Large Language Models: Non-linearity, Sparsity, and Attention-Sink-Free

Jupyter Notebook 100 6 Updated Sep 19, 2025

A scalable generative AI framework built for researchers and developers working on Large Language Models, Multimodal, and Speech AI (Automatic Speech Recognition and Text-to-Speech)

Python 16,013 3,169 Updated Oct 31, 2025

LEAKED SYSTEM PROMPTS FOR CHATGPT, GEMINI, GROK, CLAUDE, PERPLEXITY, CURSOR, DEVIN, REPLIT, AND MORE! - AI SYSTEMS TRANSPARENCY FOR ALL! 👐

11,585 2,341 Updated Oct 29, 2025

[ICML 2025 Spotlight] ShadowKV: KV Cache in Shadows for High-Throughput Long-Context LLM Inference

Python 264 18 Updated May 1, 2025

FlexAttention based, minimal vllm-style inference engine for fast Gemma 2 inference.

Python 301 18 Updated Aug 7, 2025

A Model Context Protocol (MCP) tool server for OpenAI's GPT-4o/gpt-image-1 image generation and editing APIs.

TypeScript 77 24 Updated May 31, 2025

An open-source AI agent that brings the power of Gemini directly into your terminal.

TypeScript 81,037 8,997 Updated Oct 31, 2025

gpt-oss-120b and gpt-oss-20b are two open-weight language models by OpenAI

Python 19,029 1,888 Updated Oct 23, 2025

The power of Claude Code / GeminiCLI / CodexCLI + [Gemini / OpenAI / OpenRouter / Azure / Grok / Ollama / Custom Model / All Of The Above] working as one.

Python 9,398 783 Updated Oct 22, 2025

MCP server that enables AI assistants to interact with Google Gemini CLI, leveraging Gemini's massive token window for large file analysis and codebase understanding

TypeScript 1,531 118 Updated Aug 11, 2025

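This repository contains the complete research and analysis materials from reverse engineering Claude Code v1.0.33. It includes an in-depth technical analysis of the obfuscated source code, system architecture documentation, and an implementation blueprint for reconstructing the Claude Code agent system. Key findings include the real-time Steering mechanism, the multi-agent architecture, intelligent context management, and the tool execution pipeline. The project serves as a technical reference for understanding the design and implementation of modern AI agent systems.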

JavaScript 11,059 2,915 Updated Jul 19, 2025

Ray is an AI compute engine. Ray consists of a core distributed runtime and a set of AI Libraries for accelerating ML workloads.

Python 39,616 6,847 Updated Oct 31, 2025

Everything about the SmolLM and SmolVLM family of models

Python 3,353 231 Updated Sep 16, 2025

Roo Code gives you a whole dev team of AI agents in your code editor.

TypeScript 20,477 2,353 Updated Oct 31, 2025

A CLI tool for analyzing Claude Code/Codex CLI usage from local JSONL files.

TypeScript 8,756 270 Updated Oct 27, 2025

🌊 The leading agent orchestration platform for Claude. Deploy intelligent multi-agent swarms, coordinate autonomous workflows, and build conversational AI systems. Features enterprise-grade archite…

JavaScript 9,444 1,249 Updated Oct 27, 2025

Real-time Claude Code usage monitor with predictions and warnings

Python 5,585 268 Updated Sep 14, 2025

A powerful GUI app and Toolkit for Claude Code - Create custom agents, manage interactive Claude Code sessions, run secure background agents, and more.

TypeScript 18,509 1,409 Updated Oct 16, 2025

M1: Towards Scalable Test-Time Compute with Mamba Reasoning Models

Python 44 2 Updated Jul 17, 2025

Nano vLLM

Python 7,255 934 Updated Aug 31, 2025