Stars
Demystifying Datapath Accelerator Enhanced Off-path SmartNIC [ICNP24]
Example of multi-process, multi-GPU training using Torch-parallel, NVIDIA NCCL, and NVIDIA MPS
DeepSpeed is a deep learning optimization library that makes distributed training and inference easy, efficient, and effective.
PEAKS: Power Efficiency Aware Kubernetes Scheduler
GTNS is a discrete-event network simulator intended primarily for research and educational use. GTNS is written in Visual C++ and supports different network topologies. This si…
Kubernetes training from basics to advanced
FEDML - The unified and scalable ML library for large-scale distributed training, model serving, and federated learning. FEDML Launch, a cross-cloud scheduler, further enables running any AI jobs o…
A library for Partially Homomorphic Encryption in Python
Declarative cluster management using constraint programming, where constraints are described using SQL.
The AWS virtual GPU device plugin lets you use smaller virtual GPUs for machine learning inference workloads
Reference implementations of MLPerf® training benchmarks
Determined is an open-source machine learning platform that simplifies distributed training, hyperparameter tuning, experiment tracking, and resource management. Works with PyTorch and TensorFlow.
Distributed training framework for TensorFlow, Keras, PyTorch, and Apache MXNet.
YOLO3D: End-to-end real-time 3D Oriented Object Bounding Box Detection from LiDAR Point Cloud (ECCV 2018)
Open API for IP Applications to Offload TCP/UDP Session Packet Processing to Hardware