YouSenRong
💻 coding

A high-performance inference engine for LLMs, optimized for diverse AI accelerators.

C++ 704 77 Updated Nov 16, 2025

📚LeetCUDA: Modern CUDA Learn Notes with PyTorch for Beginners🐑, 200+ CUDA Kernels, Tensor Cores, HGEMM, FA-2 MMA.🎉

Cuda 8,462 835 Updated Nov 6, 2025

A Datacenter Scale Distributed Inference Serving Framework

Rust 5,477 690 Updated Nov 16, 2025

Production-tested AI infrastructure tools for efficient AGI development and community-driven innovation

7,927 287 Updated May 15, 2025

An open-source C++ library developed and used at Facebook.

C++ 30,057 5,803 Updated Nov 16, 2025

High-Performance C++ Fundamental Library

C++ 613 89 Updated Nov 9, 2025

A toolkit to run Ray applications on Kubernetes

Go 2,133 651 Updated Nov 15, 2025

A curated list of awesome C++ (or C) frameworks, libraries, resources, and shiny things. Inspired by awesome-... stuff.

67,755 8,150 Updated Nov 4, 2025

C++ implementation of Raft core logic as a replication library

C++ 1,142 270 Updated Nov 14, 2025

common in-memory tensor structure

C++ 1,098 155 Updated Oct 11, 2025

Deep Learning Deployment Framework: supports tf/torch/trt/trtllm/vllm and other NN frameworks, as well as dynamic batching and streaming modes. It is dual-language compatible with Python and C++, off…

C++ 167 10 Updated May 8, 2025

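Cube Studio is an open-source, cloud-native, one-stop machine learning / deep learning / large-model AI platform: full MLOps pipeline, compute rental platform, online notebook development, drag-and-drop pipeline orchestration, multi-node multi-GPU distributed training, hyperparameter search, inference serving with vGPU virtualization, edge computing, automated labeling on the annotation platform, SFT fine-tuning / reward model / reinforcement learning training for large models such as DeepSeek, multi-node large-model inference with vllm/ollama/mindie, private knowledge base, AI model marketplace…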

Python 4,683 822 Updated Nov 7, 2025

HierarchicalKV is a part of NVIDIA Merlin and provides hierarchical key-value storage to meet RecSys requirements. The key capability of HierarchicalKV is to store key-value feature-embeddings on h…

Cuda 176 30 Updated Nov 2, 2025

Fine-tuning & Reinforcement Learning for LLMs. 🦥 Train OpenAI gpt-oss, DeepSeek-R1, Qwen3, Gemma 3, TTS 2x faster with 70% less VRAM.

Python 48,316 3,968 Updated Nov 15, 2025

Composable transformations of Python+NumPy programs: differentiate, vectorize, JIT to GPU/TPU, and more

Python 33,975 3,247 Updated Nov 16, 2025

Cross-platform, customizable multimedia/video processing framework. With strong GPU acceleration, heterogeneous design, multi-language support, easy to use, multi-framework compatible and high perf…

C++ 987 103 Updated Oct 31, 2025

LLM101n: Let's build a Storyteller

35,557 1,935 Updated Aug 1, 2024

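AISystem mainly refers to AI systems, covering full-stack low-level AI technologies such as AI chips, AI compilers, and AI inference and training frameworks.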

Jupyter Notebook 15,614 2,240 Updated Sep 3, 2025

A unified library of state-of-the-art model optimization techniques like quantization, pruning, distillation, speculative decoding, etc. It compresses deep learning models for downstream deployment…

Python 1,537 193 Updated Nov 16, 2025

Get up and running with OpenAI gpt-oss, DeepSeek-R1, Gemma 3 and other models.

Go 156,036 13,653 Updated Nov 16, 2025

Scalable, Portable and Distributed Gradient Boosting (GBDT, GBRT or GBM) Library, for Python, R, Java, Scala, C++ and more. Runs on single machine, Hadoop, Spark, Dask, Flink and DataFlow

C++ 27,617 8,823 Updated Nov 12, 2025

Guidelines Support Library

C++ 6,561 760 Updated Nov 3, 2025

An Easy-to-Use and High-Performance AI Deployment Framework

C++ 1,504 182 Updated Nov 14, 2025

PyTorch/TorchScript/FX compiler for NVIDIA GPUs using TensorRT

Python 2,884 369 Updated Nov 16, 2025

High performance server-side application framework

C++ 8,968 1,649 Updated Nov 13, 2025

A cheatsheet of modern C++ language and library features.

21,238 2,240 Updated Apr 5, 2025

Common utilities for ONNX converters

Python 284 68 Updated Sep 4, 2025

DeepRec is a high-performance recommendation deep learning framework based on TensorFlow. It is hosted in incubation in LF AI & Data Foundation.

C++ 1,145 357 Updated Jan 21, 2025

High-performance Inference and Deployment Toolkit for LLMs and VLMs based on PaddlePaddle

Python 3,575 660 Updated Nov 15, 2025