Stars
A Flexible Framework for Experiencing Cutting-edge LLM Inference Optimizations
A ComfyUI custom node designed for advanced image background removal and object, face, clothes, and fashion segmentation, utilizing multiple models including RMBG-2.0, INSPYRENET, BEN, BEN2, BiRefN…
intervitens / sglang
Forked from sgl-project/sglang
SGLang is yet another fast serving framework for large language models and vision language models.
SGLang is a fast serving framework for large language models and vision language models.
🚀 Fast and simple Node.js version manager, built in Rust
An extremely fast Python package and project manager, written in Rust.
llama.cpp fork with additional SOTA quants and improved performance
Reliable model swapping for any local OpenAI compatible server - llama.cpp, vllm, etc
Triton-based implementation of Sparse Mixture of Experts.
woct0rdho / SageAttention
Forked from thu-ml/SageAttention
Fork of SageAttention for Windows wheels and easy installation
A modern, minimalist, and elegant theme for SillyTavern. Inspired by moonlit nights and gentle echoes of serenity.
GameStream client for PCs (Windows, Mac, Linux, and Steam Link)
Self-hosted game stream host for Moonlight.
Dockerized FastAPI wrapper for Kokoro-82M text-to-speech model w/CPU ONNX and NVIDIA GPU PyTorch support, handling, and auto-stitching
User-friendly AI Interface (Supports Ollama, OpenAI API, ...)
An optimized quantization and inference library for running LLMs locally on modern consumer-class GPUs
A simple command line tool to overclock Nvidia GPUs using the NVML library on Linux. This supports both X11 and Wayland.
Scripts to control NVIDIA GPUs using NVML API
Wan: Open and Advanced Large-Scale Video Generative Models
For unloading a model or all models, using the memory management that is already present in ComfyUI. Copied from https://github.com/willblaschko/ComfyUI-Unload-Models but without the unnecessary ex…
Nodes related to video workflows
A set of nodes to edit videos using the Hunyuan Video model
kingbri1 / flash-attention
Forked from Dao-AILab/flash-attention
Fast and memory-efficient exact attention
sgsdxzy / YuE-exllamav2-fork
Forked from AlpinDale/Better-YuE
YuE: Open Full-song Generation Foundation Model, something similar to Suno.ai but open