Stars
GPGPU-Sim provides a detailed simulation model of contemporary NVIDIA GPUs running CUDA and/or OpenCL workloads. It includes support for features such as TensorCores and CUDA Dynamic Parallelism as…
A simple C++11 Thread Pool implementation
🦙 LaMa Image Inpainting, Resolution-robust Large Mask Inpainting with Fourier Convolutions, WACV 2022
Inpaint anything using Segment Anything and inpainting models.
Voyager is an interactive RGBD video generation model conditioned on camera input, and supports real-time 3D reconstruction.
A connection-oriented persistent message queue framework based on TCP or SHM (shared memory)
Code for loralib, an implementation of "LoRA: Low-Rank Adaptation of Large Language Models"
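Super cheat sheet - cheat sheets for programming languages, frameworks, and development tools; a single file contains everything you need to know ⚡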
Arm Mbed OS is a platform operating system designed for the internet of things
Code release for the book "Efficient Training in PyTorch"
📚LeetCUDA: Modern CUDA Learn Notes with PyTorch for Beginners🐑, 200+ CUDA Kernels, Tensor Cores, HGEMM, FA-2 MMA.🎉
🎨 ML Visuals contains figures and templates which you can reuse and customize to improve your scientific writing.
Experimental GStreamer plugin for encrypting / decrypting H264 streams with AES
A fast GPU memory copy library based on NVIDIA GPUDirect RDMA technology
Curated list of datasets and tools for post-training.
A collection of modern/faster/saner alternatives to common unix commands.
PyTorch distributed training from scratch (for educational purposes only)
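"PyTorch Practical Tutorial" (2nd edition): whether you are starting from zero, applying PyTorch to CV, NLP, or LLM projects, or moving on to advanced engineering and production deployment, it is all covered here. With this book's help, readers should be able to master PyTorch with ease and become excellent deep learning engineers.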
SGLang is a fast serving framework for large language models and vision language models.
deepstream_tools will serve as a parent repo to hold various tools to be released for DeepStream SDK.
🗂️ A file list/WebDAV program that supports multiple storages, powered by Gin and Solidjs.
Flash attention tutorial written in Python, Triton, CUDA, and CUTLASS