- Chinese Academy of Sciences
- Beijing, China
- https://riverclouds.net/
Stars
Context parallel attention that accelerates DiT model inference with dynamic caching (https://wavespeed.ai/)
Tile-Based Runtime for Ultra-Low-Latency LLM Inference
Low-overhead tracing library and trace visualizer for pipelined CUDA kernels
TensorRT-LLM provides users with an easy-to-use Python API to define Large Language Models (LLMs) and supports state-of-the-art optimizations to perform inference efficiently on NVIDIA GPUs. Tensor… (a short usage sketch of this Python API follows the list)
Autonomous GPU Kernel Generation via Deep Agents
Draft-Target Disaggregation LLM Serving System via Parallel Speculative Decoding.
Tensors and Dynamic neural networks in Python with strong GPU acceleration
verl: Volcano Engine Reinforcement Learning for LLMs
Read-only mirror of https://git.zx2c4.com/cgit/about . Pull requests and issues on GitHub cannot be accepted and will be automatically closed. The proper way to submit changes is via the mailing li…
Virtualized Elastic KV Cache for Dynamic GPU Sharing and Beyond
An Efficient and User-Friendly Scaling Library for Reinforcement Learning with Large Language Models
macOS Adobe apps download & installer
jumploop / uv-custom
Forked from Wangnov/uv-custom. A mirror project kept in sync with the official astral-sh/uv releases, aiming to give users in mainland China a faster and more stable experience installing and using uv.
Artifact from "Hardware Compute Partitioning on NVIDIA GPUs". THIS IS A FORK OF BAKITA'S REPO. I AM NOT ONE OF THE AUTHORS OF THE PAPER.
ColorBrewer color schemes for gnuplot
Venus Collective Communication Library, supported by SII and Infrawaves.
Chat log tool that makes it easy to use your own chat data.
Open-source observability for your GenAI or LLM application, based on OpenTelemetry
Empowering LLM Agents for Real-World Computer System Optimization
AgentNetworkProtocol (ANP) is an open-source protocol for agent communication. Our vision is to define how agents connect with each other, building an open, secure, and efficient collaboration netwo…
Checkpoint-engine is a simple middleware to update model weights in LLM inference engines
NVIDIA NVSHMEM is a parallel programming interface for NVIDIA GPUs based on OpenSHMEM. NVSHMEM can significantly reduce multi-process communication and coordination overheads by allowing programmer…
Production-grade client-side tracing, profiling, and analysis for complex software systems.
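
For the TensorRT-LLM entry above: a minimal sketch of its high-level Python LLM API as advertised in the project's quick-start material; the model id and sampling settings are placeholder assumptions, not details taken from the starred description.

    from tensorrt_llm import LLM, SamplingParams

    # Placeholder model id (assumption); any Hugging Face checkpoint supported by TensorRT-LLM works.
    llm = LLM(model="TinyLlama/TinyLlama-1.1B-Chat-v1.0")

    prompts = ["Explain speculative decoding in one sentence."]
    sampling_params = SamplingParams(temperature=0.8, top_p=0.95)

    # generate() builds or loads the TensorRT engine and runs batched inference on the GPU.
    for output in llm.generate(prompts, sampling_params):
        print(output.outputs[0].text)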