Stars
Domain-specific language designed to streamline the development of high-performance GPU, CPU, and accelerator kernels
🤗 Transformers: the model-definition framework for state-of-the-art machine learning models across text, vision, audio, and multimodal domains, for both inference and training.
Development repository for the Triton language and compiler
A framework for few-shot evaluation of language models.
Tensors and dynamic neural networks in Python with strong GPU acceleration
TensorRT LLM provides users with an easy-to-use Python API to define Large Language Models (LLMs) and supports state-of-the-art optimizations to perform inference efficiently on NVIDIA GPUs. Tensor…