UCCL is an efficient communication library for GPUs, covering collectives, P2P (e.g., KV cache transfer, RL weight transfer), and EP (e.g., GPU-driven)

C++ 946 82 Updated Nov 10, 2025

InfiniBand Verbs Performance Tests

C 861 355 Updated Oct 27, 2025

The simplest, fastest repository for training/finetuning medium-sized GPTs.

Python 49,283 8,255 Updated Dec 9, 2024

Building a MoE large language model on your own: a complete walkthrough from pretraining to DPO

Python 1,779 139 Updated Nov 5, 2025

RDMA core userspace libraries and daemons

C 2,014 794 Updated Nov 9, 2025

Optimized primitives for collective multi-GPU communication

C++ 4,217 1,063 Updated Nov 10, 2025

NVIDIA Inference Xfer Library (NIXL)

C++ 712 181 Updated Nov 10, 2025

A tiny yet powerful LLM inference system tailored for research purposes. vLLM-equivalent performance with only 2k lines of code (2% of vLLM).

Python 286 35 Updated Jun 10, 2025

This is a repo for a mini distributed warehouse management system based on gRPC

Python 1 Updated Oct 12, 2025

Network Benchmarking Utility

C++ 686 128 Updated Jul 31, 2025

LITE Kernel RDMA Support for Datacenter Applications. SOSP 2017.

Objective-C 111 20 Updated Jul 9, 2020

Fastsocket is a highly scalable socket and its underlying networking implementation of Linux kernel. With the straight linear scalability, Fastsocket can provide extremely good performance in multi…

C 3,753 724 Updated Apr 19, 2018

mimalloc is a compact general purpose allocator with excellent performance.

C 12,133 1,012 Updated Nov 6, 2025

Pond: CXL-Based Memory Pooling Systems for Cloud Platforms (ASPLOS'23)

HTML 212 45 Updated Oct 13, 2024

The Artifact Evaluation Version of SOSP Paper #19

C++ 51 12 Updated Aug 19, 2024

Checkpoint/Restore tool

C 3,470 678 Updated Nov 9, 2025
C 3 Updated Aug 13, 2019

Linux kernel source tree

C 863 144 Updated Sep 4, 2025

a library version of FreeBSD's TCP/IP stack plus extras

C 766 201 Updated Feb 8, 2017

Remote Persistent Memory Access Library

C 104 55 Updated Sep 5, 2023

AIFM: High-Performance, Application-Integrated Far Memory

C 124 40 Updated Feb 28, 2023

Tools for profiling the Linux network stack.

Python 158 17 Updated Oct 21, 2022

neper is a Linux networking performance tool.

C 317 86 Updated Nov 4, 2025

libunwind official github repo (in need of new / additional maintainer, mail/open issue if interested)

C 1,151 317 Updated Oct 1, 2025

Modern HTTP benchmarking tool

C 39,743 3,025 Updated Dec 30, 2023

Main gperftools repository

C++ 8,854 1,534 Updated Oct 10, 2025

Fork of the official iperf-3.1.3 that runs on the DPDK user-space TCP/IP stack (ANS).

C 92 34 Updated May 4, 2019

iperf3: A TCP, UDP, and SCTP network bandwidth measurement tool

C 8,007 1,379 Updated Nov 10, 2025

nettrace is an eBPF-based tool for tracing network packets and diagnosing network problems.

C 464 105 Updated Oct 27, 2025

BCC - Tools for BPF-based Linux IO analysis, networking, monitoring, and more

C 21,928 4,015 Updated Nov 6, 2025