网商路电子厂
- Hangzhou
Stars
The most powerful and modular diffusion model GUI, API, and backend with a graph/nodes interface.
Aiming to integrate most existing feature caching-based diffusion acceleration schemes into a unified framework.
Official PyTorch implementation of the paper "dLLM-Cache: Accelerating Diffusion Large Language Models with Adaptive Caching" (dLLM-Cache).
[ICCV2025] From Reusing to Forecasting: Accelerating Diffusion Models with TaylorSeers
FlashInfer: Kernel Library for LLM Serving
Transformers-compatible library for applying various compression algorithms to LLMs for optimized deployment with vLLM
Code for the ICLR 2023 paper "GPTQ: Accurate Post-training Quantization of Generative Pretrained Transformers".
[ICLR 2023] Official implementation of the paper "DINO: DETR with Improved DeNoising Anchor Boxes for End-to-End Object Detection"
Standardized Distributed Generative and Predictive AI Inference Platform for Scalable, Multi-Framework Deployment on Kubernetes
NVIDIA® TensorRT™ is an SDK for high-performance deep learning inference on NVIDIA GPUs. This repository contains the open source components of TensorRT.
Fast and memory-efficient exact attention
The Triton Inference Server provides an optimized cloud and edge inferencing solution.
TensorRT LLM provides users with an easy-to-use Python API to define Large Language Models (LLMs) and supports state-of-the-art optimizations to perform inference efficiently on NVIDIA GPUs. Tensor…
🤗 Transformers: the model-definition framework for state-of-the-art machine learning models in text, vision, audio, and multimodal models, for both inference and training.
free C++ class library of cryptographic schemes
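🌚 🌍 🌝 Enhanced edition of GeoIP rule files, with support for customizing V2Ray dat files (geoip.dat), MaxMind mmdb files, sing-box SRS files, mihomo MRS files, Clash rulesets, Surge rulesets, and more. Enhanced edition of GeoIP files for V2Ray, Xray-core, sing-box,…
Icon artwork, some rules, unlock-detection scripts, and more for Quanx, Loon, Surge, Fileball, Senplayer, Yamby, Hills, Fowward, and 小幻影视. If an icon is missing, leave a comment and it can be added. I am a beginner too, and I try to bring the simplest things to everyone.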
A fast JSON parser/generator for C++ with both SAX/DOM style API
LlamaIndex is the leading framework for building LLM-powered agents over your data.
SGLang is a fast serving framework for large language models and vision language models.
A high-throughput and memory-efficient inference and serving engine for LLMs
Mooncake is the serving platform for Kimi, a leading LLM service provided by Moonshot AI.
Transformer-related optimization, including BERT and GPT
A C++ library providing various concurrent data structures and reclamation schemes.
Software Architecture with C++, published by Packt