StonyPort zhooooong

Wangshang Road Electronics Factory
Hangzhou

Stars

comfyanonymous / ComfyUI

The most powerful and modular diffusion model GUI, API, and backend with a graph/nodes interface.

Python 94,066 10,626 Updated Nov 20, 2025

Shenyi-Z / Cache4Diffusion

Aiming to integrate most existing feature caching-based diffusion acceleration schemes into a unified framework.

Python 78 8 Updated Oct 23, 2025

maomaocun / dLLM-cache

Official PyTorch implementation of the paper "dLLM-Cache: Accelerating Diffusion Large Language Models with Adaptive Caching" (dLLM-Cache).

Python 184 14 Updated Nov 17, 2025

Shenyi-Z / TaylorSeer

[ICCV2025] From Reusing to Forecasting: Accelerating Diffusion Models with TaylorSeers

Python 330 22 Updated Aug 11, 2025

flashinfer-ai / flashinfer

FlashInfer: Kernel Library for LLM Serving

Cuda 4,106 573 Updated Nov 20, 2025

vllm-project / llm-compressor

Transformers-compatible library for applying various compression algorithms to LLMs for optimized deployment with vLLM

Python 2,253 288 Updated Nov 20, 2025

IST-DASLab / gptq

Code for the ICLR 2023 paper "GPTQ: Accurate Post-training Quantization of Generative Pretrained Transformers".

Python 2,221 186 Updated Mar 27, 2024

onnx / optimizer

ONNX Optimizer

C++ 774 102 Updated Nov 2, 2025

IDEA-Research / DINO

[ICLR 2023] Official implementation of the paper "DINO: DETR with Improved DeNoising Anchor Boxes for End-to-End Object Detection"

Python 2,673 294 Updated Jul 31, 2024

kserve / kserve

Standardized Distributed Generative and Predictive AI Inference Platform for Scalable, Multi-Framework Deployment on Kubernetes

Shell 4,806 1,308 Updated Nov 20, 2025

NVIDIA / TensorRT

NVIDIA® TensorRT™ is an SDK for high-performance deep learning inference on NVIDIA GPUs. This repository contains the open source components of TensorRT.

C++ 12,390 2,273 Updated Nov 12, 2025

Dao-AILab / flash-attention

Fast and memory-efficient exact attention

Python 20,641 2,149 Updated Nov 19, 2025

triton-inference-server / server

The Triton Inference Server provides an optimized cloud and edge inferencing solution.

Python 10,057 1,674 Updated Nov 20, 2025

NVIDIA / TensorRT-LLM

TensorRT LLM provides users with an easy-to-use Python API to define Large Language Models (LLMs) and supports state-of-the-art optimizations to perform inference efficiently on NVIDIA GPUs. Tensor…

C++ 12,190 1,881 Updated Nov 20, 2025

huggingface / transformers

🤗 Transformers: the model-definition framework for state-of-the-art machine learning models in text, vision, audio, and multimodal models, for both inference and training.

Python 152,775 31,187 Updated Nov 20, 2025

weidai11 / cryptopp

free C++ class library of cryptographic schemes

C++ 5,351 1,655 Updated Aug 1, 2024

Loyalsoldier / geoip

🌚 🌍 🌝 Enhanced edition of GeoIP rule files, with support for customizing V2Ray dat-format files (geoip.dat), MaxMind mmdb-format files, sing-box SRS-format files, mihomo MRS-format files, Clash rulesets, Surge rulesets, and more. Enhanced edition of GeoIP files for V2Ray, Xray-core, sing-box,…

Go 5,267 800 Updated Nov 20, 2025