C++ 8 7 Updated Dec 28, 2025

PyTorch training at CSCS

Jupyter Notebook 19 16 Updated Jul 4, 2025

Compression for Foundation Models

Jupyter Notebook 35 5 Updated Jul 21, 2025

Dirigent: Lightweight Serverless Orchestration

Go 41 5 Updated Aug 26, 2025

CUDA benchmarks for measuring GPU utilization and interference

Cuda 13 1 Updated Feb 11, 2025

A prototype of using ibis-substrait to compile against a substrait extension

Python 2 Updated Apr 11, 2023

Distributed Communication-Optimal LU-factorization Algorithm

C++ 12 4 Updated Aug 1, 2021

A cross platform way to express data transformation, relational algebra, standardized record expression and plans.

Python 1,443 189 Updated Dec 25, 2025

RMG is an Open Source code for electronic structure calculations and modeling of materials and molecules. It is based on density functional theory and uses a real space basis and pseudopotentials.

C++ 55 16 Updated Jan 1, 2026

Neovim config for the lazy

Lua 24,445 1,694 Updated Nov 11, 2025

Spiking neuron integration for PyTorch

Python 41 5 Updated Mar 18, 2025

Google Research

Jupyter Notebook 36,975 8,283 Updated Dec 31, 2025
Python 77 6 Updated May 4, 2021

Zero-copy MPI communication of JAX arrays, for turbo-charged HPC applications in Python ⚡

Python 507 32 Updated Dec 19, 2025
Jupyter Notebook 63 3 Updated Mar 4, 2022

Extending JAX with custom C++ and CUDA code

Python 402 23 Updated Aug 18, 2024

Long Range Arena for Benchmarking Efficient Transformers

Python 771 84 Updated Dec 16, 2023

Flax is a neural network library for JAX that is designed for flexibility.

Jupyter Notebook 6,999 773 Updated Dec 24, 2025

Making large AI models cheaper, faster and more accessible

Python 41,314 4,544 Updated Dec 22, 2025

PyTorch3D is FAIR's library of reusable components for deep learning with 3D data

Python 9,724 1,431 Updated Dec 17, 2025

Model parallel transformers in JAX and Haiku

Python 6,358 888 Updated Jan 21, 2023

Fast and memory-efficient exact attention

Python 21,391 2,258 Updated Jan 1, 2026

Training and serving large-scale neural networks with auto parallelization.

Python 3,173 357 Updated Dec 9, 2023

Code for "Heterogeneity-Aware Cluster Scheduling Policies for Deep Learning Workloads", which appeared at OSDI 2020

Jupyter Notebook 136 34 Updated Jul 25, 2024

Trax — Deep Learning with Clear Code and Speed

Python 8,298 828 Updated Sep 26, 2025

ML-Perf HPC WG implementation of Mesh-TensorFlow and build scripts for TensorFlow with MPI

Python 4 1 Updated Oct 18, 2019

Matrix multiplication on GPUs for matrices stored on a CPU. Similar to cublasXt, but ported to both NVIDIA and AMD GPUs.

C++ 32 11 Updated Apr 2, 2025

Distributed Communication-Optimal Shuffle and Transpose Algorithm

C++ 14 5 Updated May 6, 2025

Distributed Communication-Optimal Matrix-Matrix Multiplication Algorithm

C++ 212 31 Updated Dec 4, 2025