SGLang is a fast serving framework for large language models and vision language models.

Python 20,455 3,536 Updated Nov 27, 2025

Production-tested AI infrastructure tools for efficient AGI development and community-driven innovation

7,932 286 Updated May 15, 2025

Semantic cache for LLMs. Fully integrated with LangChain and llama_index.

Python 7,851 568 Updated Jul 11, 2025

Transformers-compatible library for applying various compression algorithms to LLMs for optimized deployment with vLLM

Python 2,282 294 Updated Nov 27, 2025

Tensors and Dynamic neural networks in Python with strong GPU acceleration

Python 95,422 26,030 Updated Nov 27, 2025

Development repository for the Triton language and compiler

MLIR 17,685 2,411 Updated Nov 27, 2025

OneDiff: An out-of-the-box acceleration library for diffusion models.

Jupyter Notebook 1,948 125 Updated May 8, 2025

ONNX Runtime: cross-platform, high performance ML inferencing and training accelerator

C++ 18,491 3,561 Updated Nov 27, 2025

Making large AI models cheaper, faster and more accessible

Python 41,273 4,541 Updated Nov 24, 2025

OneFlow is a deep learning framework designed to be user-friendly, scalable and efficient.

C++ 9,376 1,011 Updated Aug 20, 2025

DeepSpeed is a deep learning optimization library that makes distributed training and inference easy, efficient, and effective.

Python 40,841 4,650 Updated Nov 26, 2025

TNN: developed by Tencent Youtu Lab and Guangying Lab, a uniform deep learning inference framework for mobile, desktop, and server. TNN is distinguished by several outstanding features, including its…

C++ 4,593 772 Updated May 9, 2025

AITemplate is a Python framework that renders neural networks into high-performance CUDA/HIP C++ code. Specialized for FP16 TensorCore (NVIDIA GPU) and MatrixCore (AMD GPU) inference.

Python 4,694 382 Updated Oct 27, 2025

NVIDIA® TensorRT™ is an SDK for high-performance deep learning inference on NVIDIA GPUs. This repository contains the open source components of TensorRT.

C++ 12,417 2,277 Updated Nov 12, 2025

ncnn is a high-performance neural network inference framework optimized for the mobile platform

C++ 22,332 4,354 Updated Nov 26, 2025

Transformer-related optimization, including BERT and GPT

C++ 6,354 923 Updated Mar 27, 2024

Visualizer for neural network, deep learning and machine learning models

JavaScript 31,884 3,033 Updated Nov 27, 2025