Popular repositories
- AITemplate (forked from facebookincubator/AITemplate) · Python
  AITemplate is a Python framework that renders neural networks into high-performance CUDA/HIP C++ code. Specialized for FP16 TensorCore (NVIDIA GPU) and MatrixCore (AMD GPU) inference.
- XNNPACK (forked from google/XNNPACK) · C
  High-efficiency floating-point neural network inference operators for mobile, server, and Web.
- distributed-llama (forked from b4rtaz/distributed-llama) · C++
  Distributed LLM inference. Connect home devices into a powerful cluster to accelerate LLM inference. More devices mean faster inference.