zhooooong
  • 网商路电子厂
  • Hangzhou

The most powerful and modular diffusion model GUI, api and backend with a graph/nodes interface.

Python 94,066 10,626 Updated Nov 20, 2025

Aiming to integrate most existing feature caching-based diffusion acceleration schemes into a unified framework.

Python 78 8 Updated Oct 23, 2025

Official PyTorch implementation of the paper "dLLM-Cache: Accelerating Diffusion Large Language Models with Adaptive Caching" (dLLM-Cache).

Python 184 14 Updated Nov 17, 2025

[ICCV2025] From Reusing to Forecasting: Accelerating Diffusion Models with TaylorSeers

Python 330 22 Updated Aug 11, 2025

FlashInfer: Kernel Library for LLM Serving

Cuda 4,106 573 Updated Nov 20, 2025

Transformers-compatible library for applying various compression algorithms to LLMs for optimized deployment with vLLM

Python 2,253 288 Updated Nov 20, 2025

Code for the ICLR 2023 paper "GPTQ: Accurate Post-training Quantization of Generative Pretrained Transformers".

Python 2,221 186 Updated Mar 27, 2024

ONNX Optimizer

C++ 774 102 Updated Nov 2, 2025

[ICLR 2023] Official implementation of the paper "DINO: DETR with Improved DeNoising Anchor Boxes for End-to-End Object Detection"

Python 2,673 294 Updated Jul 31, 2024

Standardized Distributed Generative and Predictive AI Inference Platform for Scalable, Multi-Framework Deployment on Kubernetes

Shell 4,806 1,308 Updated Nov 20, 2025

NVIDIA® TensorRT™ is an SDK for high-performance deep learning inference on NVIDIA GPUs. This repository contains the open source components of TensorRT.

C++ 12,390 2,273 Updated Nov 12, 2025

Fast and memory-efficient exact attention

Python 20,641 2,149 Updated Nov 19, 2025

The Triton Inference Server provides an optimized cloud and edge inferencing solution.

Python 10,057 1,674 Updated Nov 20, 2025

TensorRT LLM provides users with an easy-to-use Python API to define Large Language Models (LLMs) and supports state-of-the-art optimizations to perform inference efficiently on NVIDIA GPUs. Tensor…

C++ 12,190 1,881 Updated Nov 20, 2025

🤗 Transformers: the model-definition framework for state-of-the-art machine learning models in text, vision, audio, and multimodal models, for both inference and training.

Python 152,775 31,187 Updated Nov 20, 2025

free C++ class library of cryptographic schemes

C++ 5,351 1,655 Updated Aug 1, 2024

🌚 🌍 🌝 Enhanced edition of GeoIP rule files, with support for customizing the V2Ray dat-format file geoip.dat, MaxMind mmdb-format files, sing-box SRS-format files, mihomo MRS-format files, Clash rulesets, Surge rulesets, and more. Enhanced edition of GeoIP files for V2Ray, Xray-core, sing-box,…

Go 5,267 800 Updated Nov 20, 2025

A compact, accurate, and practical GeoIP2 database

Go 7,164 207 Updated Nov 19, 2025

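Icon artwork, some rules, unlock-detection scripts, and more for Quanx, Loon, Surge, Fileball, Senplayer, Yamby, Hills, Fowward, and 小幻影视. If an icon is missing, leave a comment and it can be added. I'm a beginner too, and I try to bring everyone the simplest things.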

JavaScript 370 27 Updated Nov 18, 2025

Telegram Bot usage documentation and API reverse proxy

33 3 Updated Apr 11, 2022

A fast JSON parser/generator for C++ with both SAX/DOM style API

C++ 14,892 3,636 Updated Feb 5, 2025

Fast C++ logging library.

C++ 27,679 4,963 Updated Nov 17, 2025

LlamaIndex is the leading framework for building LLM-powered agents over your data.

Python 45,331 6,534 Updated Nov 19, 2025

SGLang is a fast serving framework for large language models and vision language models.

Python 20,261 3,463 Updated Nov 20, 2025

A high-throughput and memory-efficient inference and serving engine for LLMs

Python 63,548 11,424 Updated Nov 20, 2025

Mooncake is the serving platform for Kimi, a leading LLM service provided by Moonshot AI.

C++ 4,287 434 Updated Nov 20, 2025

Transformer related optimization, including BERT, GPT

C++ 6,353 924 Updated Mar 27, 2024

Recommendations and reviews of proxy service providers ("airports")

8,019 192 Updated Nov 19, 2025

A C++ library providing various concurrent data structures and reclamation schemes.

C++ 625 62 Updated Aug 7, 2025

Software Architecture with C++, published by Packt

C++ 540 167 Updated Jun 26, 2024