
GPGPU-Sim provides a detailed simulation model of contemporary NVIDIA GPUs running CUDA and/or OpenCL workloads. It includes support for features such as TensorCores and CUDA Dynamic Parallelism as…

C++ 1,503 599 Updated Feb 15, 2025

C/C++ hooks to integrate with pre-commit

Python 374 80 Updated Mar 20, 2024

A simple C++11 Thread Pool implementation

C++ 8,586 2,347 Updated Jul 20, 2024

🦙 LaMa Image Inpainting, Resolution-robust Large Mask Inpainting with Fourier Convolutions, WACV 2022

Jupyter Notebook 9,435 997 Updated Feb 5, 2025

Inpaint anything using Segment Anything and inpainting models.

Jupyter Notebook 7,511 648 Updated Feb 29, 2024

Voyager is an interactive RGBD video generation model conditioned on camera input, and supports real-time 3D reconstruction.

Python 1,394 137 Updated Oct 22, 2025

A connection-oriented persistent message queue framework based on TCP or SHM (shared memory)

C++ 471 139 Updated May 4, 2020

torchview: visualize pytorch models

Python 1,004 48 Updated May 18, 2025

Code for loralib, an implementation of "LoRA: Low-Rank Adaptation of Large Language Models"

Python 12,994 863 Updated Dec 17, 2024

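Super cheat sheets: cheat sheets for programming languages, frameworks, and development tools, with everything you need to know in a single file ⚡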

Shell 12,329 2,117 Updated Nov 12, 2025

C++ extensions in PyTorch

Python 1,164 248 Updated Jul 8, 2025

Arm Mbed OS is a platform operating system designed for the internet of things

C 4,797 3,021 Updated Oct 8, 2024

Code release for book "Efficient Training in PyTorch"

Python 112 17 Updated Apr 10, 2025

📚LeetCUDA: Modern CUDA Learn Notes with PyTorch for Beginners🐑, 200+ CUDA Kernels, Tensor Cores, HGEMM, FA-2 MMA.🎉

Cuda 8,636 848 Updated Nov 6, 2025

A tutorial for CUDA & PyTorch

C++ 169 33 Updated Jan 21, 2025

🎨 ML Visuals contains figures and templates which you can reuse and customize to improve your scientific writing.

16,082 1,498 Updated Feb 13, 2023

RDMA core userspace libraries and daemons

C 2,037 801 Updated Nov 26, 2025

Experimental GStreamer plugin for encrypting / decrypting H264 streams with AES

C 8 1 Updated Sep 7, 2024

A fast GPU memory copy library based on NVIDIA GPUDirect RDMA technology

C++ 1,281 177 Updated Aug 19, 2025

Curated list of datasets and tools for post-training.

4,019 331 Updated Nov 10, 2025

DPO training for Tongyi Qianwen (Qwen)

Jupyter Notebook 60 6 Updated Sep 21, 2024

cnn

Python 134 23 Updated Sep 8, 2019

A collection of modern/faster/saner alternatives to common unix commands.

32,602 819 Updated Sep 10, 2024

PyTorch distributed training from scratch (for educational purposes only)

Python 19 3 Updated Apr 12, 2025

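"PyTorch Practical Tutorial" (2nd edition): whether you are starting from zero, applying PyTorch to CV, NLP, or LLM projects, or moving on to production engineering and deployment, it is all covered here. With this book's help, readers should be able to master PyTorch with ease and become excellent deep learning engineers.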

Jupyter Notebook 4,273 465 Updated Jan 27, 2025

SGLang is a fast serving framework for large language models and vision language models.

Python 20,455 3,536 Updated Nov 27, 2025

deepstream_tools will serve as a parent repo to hold various tools to be released for DeepStream SDK.

Python 23 4 Updated Nov 5, 2025

🗂️ A file list/WebDAV program that supports multiple storages, powered by Gin and Solidjs.

Go 48,577 8,001 Updated Nov 15, 2025

Collection of CVPR 2025 papers and open-source projects

21,530 2,772 Updated Jul 2, 2025

Flash Attention tutorial written in Python, Triton, CUDA, and CUTLASS

Cuda 453 50 Updated May 14, 2025