Stars
A flexible and efficient implementation of Flash Attention 2.0 for JAX, supporting multiple backends (GPU/TPU/CPU) and platforms (Triton/Pallas/JAX).
Yet Another Language Model: LLM inference in C++/CUDA, no libraries except for I/O
Implement a simple Transformer model in C++. Attention Is All You Need.
TensorRT LLM provides users with an easy-to-use Python API to define Large Language Models (LLMs) and supports state-of-the-art optimizations to perform inference efficiently on NVIDIA GPUs.
A large-scale simulation framework for LLM inference
LLMPerf is a library for validating and benchmarking LLMs
Distributed LLM and StableDiffusion inference for mobile, desktop and server.
Here are my personal paper reading notes (including cloud computing, resource management, systems, machine learning, deep learning, and other interesting topics).
Open5GS 5GC & UERANSIM UE / RAN Sample Configuration
📚A curated list of Awesome LLM/VLM Inference Papers with Code: Flash-Attention, Paged-Attention, WINT8/4, Parallelism, etc.🎉
Analyze the inference of Large Language Models (LLMs): computation, storage, transmission, and the hardware roofline model, presented in a user-friendly interface.
A quick-start programming guide to the Model Context Protocol (MCP).
AG-UI: the Agent-User Interaction Protocol. Bring Agents into Frontend Applications.
A Flexible Framework for Experiencing Cutting-edge LLM Inference Optimizations
Implement a ChatGPT-like LLM in PyTorch from scratch, step by step
An open protocol enabling communication and interoperability between opaque agentic applications.
LLM-Based Human-Agent Collaboration and Interaction Systems: A Survey | Awesome Human-Agent Collaboration | Human-AI Collaboration
Skywork-R1V is an advanced multimodal AI model series developed by Skywork AI (Kunlun Inc.), specializing in vision-language reasoning.
A lightweight, powerful framework for multi-agent workflows
Chinese documentation for DGL. This is the Chinese manual for the graph neural network library DGL; it currently contains the User Guide.
No fortress, purely open ground. OpenManus is Coming.
Mooncake is the serving platform for Kimi, a leading LLM service provided by Moonshot AI.
LLM serving cluster simulator