Stars
High-performance safetensors model loader
LeaderWorkerSet: An API for deploying a group of pods as a unit of replication
📚LeetCUDA: Modern CUDA Learn Notes with PyTorch for Beginners🐑, 200+ CUDA Kernels, Tensor Cores, HGEMM, FA-2 MMA.🎉
Transformers-compatible library for applying various compression algorithms to LLMs for optimized deployment with vLLM
TensorRT LLM provides users with an easy-to-use Python API to define Large Language Models (LLMs) and supports state-of-the-art optimizations to perform inference efficiently on NVIDIA GPUs. Tensor…
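Abu quantitative trading system (stocks, options, futures, Bitcoin, machine learning): an open-source, Python-based framework for quantitative trading and quantitative investment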
Supercharge Your LLM with the Fastest KV Cache Layer
vLLM’s reference system for K8S-native cluster-wide deployment with community-driven performance optimization
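AIInfra (AI infrastructure) covers the AI system stack, from underlying hardware such as chips up to the software stack that supports training and inference of large AI models.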
Achieve state of the art inference performance with modern accelerators on Kubernetes
Injecting Adrenaline into LLM Serving: Boosting Resource Utilization and Throughput via Attention Disaggregation
A Datacenter Scale Distributed Inference Serving Framework
Mooncake is the serving platform for Kimi, a leading LLM service provided by Moonshot AI.
A high-throughput and memory-efficient inference and serving engine for LLMs
zgsm-ai / costrict
Forked from RooCodeInc/Roo-Code
Costrict - strict AI coder for enterprises, quality first, including AI Agent, AI CodeReview, AI Completion.
20+ high-performance LLMs with recipes to pretrain, finetune and deploy at scale.
Mental health large language model (LLM x Mental Health), Pre & Post-training & Dataset & Evaluation & Deploy & RAG, with InternLM / Qwen / Baichuan / DeepSeek / Mixtral / Llama / GLM series models
[ICML 2022 / ICLR 2024] Source code for our papers "Plug & Play Attacks: Towards Robust and Flexible Model Inversion Attacks" and "Be Careful What You Smooth For".
A tutorial of how to integrate Stripe Payments with Django
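FastAPI + Vue3 admin management system with a decoupled front end and back end, including a PC client and a WeChat mini program client. The API uses FastAPI + Pydantic + SQLAlchemy 2.0 + MySQL; the PC client uses Vue3 + TypeScript + Vite + Element Plus; the mini program uses Uni-APP + uview ui. Features include asynchronous storage, RBAC permission management, scheduled tasks, and department management.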
Transforms complex documents like PDFs into LLM-ready markdown/JSON for your Agentic workflows.
🤗 Diffusers: State-of-the-art diffusion models for image, video, and audio generation in PyTorch.
An Improved Langchain RAG Tutorial (v2) with local LLMs, database updates, and testing.
[NeurIPS 2022] Denoising Diffusion Restoration Models -- Official Code Repository
[ECCV 2024] Code for DiffBIR: Towards Blind Image Restoration with Generative Diffusion Prior
A latent text-to-image diffusion model
Together Mixture-Of-Agents (MoA) – 65.1% on AlpacaEval with OSS models
llama3 implementation one matrix multiplication at a time