View masahi's full-sized avatar

Organizations

@apache @dmlc @octoml


Domain-specific language designed to streamline the development of high-performance GPU, CPU, and accelerator kernels

C++ 4,259 349 Updated Dec 19, 2025

CUDA/Metal accelerated language model inference

C 625 30 Updated May 29, 2025

RPyC (Remote Python Call) - A transparent and symmetric RPC library for Python

Python 1,682 251 Updated Aug 14, 2025

📚 Jupyter notebook tutorials for OpenVINO™

Jupyter Notebook 2,978 948 Updated Dec 19, 2025

Embree ray tracing kernels repository.

C++ 2,616 420 Updated Dec 17, 2025

Universal LLM Deployment Engine with ML Compilation

Python 21,765 1,890 Updated Dec 11, 2025

Build system, successor to Buck

Rust 4,187 312 Updated Dec 19, 2025

MoonRay is DreamWorks’ open-source, award-winning, state-of-the-art production MCRT renderer.

CMake 4,526 280 Updated Nov 14, 2025

Optimized BERT transformer inference on NVIDIA GPUs. https://arxiv.org/abs/2210.03052

C++ 476 37 Updated Mar 15, 2024

Language Modeling with the H3 State Space Model

Assembly 521 52 Updated Sep 29, 2023

An open-source, efficient deep learning framework/compiler, written in Python.

Python 737 68 Updated Sep 4, 2025

An efficient vector-graphics renderer

Rust 2,642 56 Updated May 16, 2023

A GPU compute-centric 2D renderer.

Rust 3,607 205 Updated Dec 19, 2025

A modern cross-platform low-level graphics library and rendering framework

Batchfile 4,104 372 Updated Dec 18, 2025

AITemplate is a Python framework that renders neural networks into high-performance CUDA/HIP C++ code. Specialized for FP16 TensorCore (NVIDIA GPU) and MatrixCore (AMD GPU) inference.

Python 4,696 383 Updated Dec 17, 2025

Real-time GPU path tracing with an OpenUSD Hydra render delegate

C++ 590 45 Updated Aug 8, 2025

This is the development repository for the OpenFHE library. The current version is 1.4.2 (released on October 20, 2025).

C++ 1,048 263 Updated Dec 18, 2025

3D fluid simulation experiments in Rust, using WebGPU-rs (WIP)

Rust 466 16 Updated Dec 17, 2022
HLSL 459 70 Updated Sep 16, 2025
Python 52 8 Updated Mar 29, 2023

A STARK prover and verifier for arbitrary computations

Rust 882 220 Updated Jul 19, 2025

The Flutter engine

C++ 7,580 5,992 Updated Feb 25, 2025

A General-purpose Task-parallel Programming System using Modern C++

C++ 11,489 1,343 Updated Dec 19, 2025

SOTA low-bit LLM quantization (INT8/FP8/MXFP8/INT4/MXFP4/NVFP4) & sparsity; leading model compression techniques on PyTorch, TensorFlow, and ONNX Runtime

Python 2,549 286 Updated Dec 19, 2025

Single C file, Realtime CPU/GPU Profiler with Remote Web Viewer

C 3,273 281 Updated Aug 28, 2024

Vulkan and Rust experiments, including a spectral path tracer using Vulkan ray tracing extensions

Rust 131 5 Updated Sep 13, 2025

Instant neural graphics primitives: lightning fast NeRF and more

Cuda 17,148 2,038 Updated Dec 14, 2025

magic-trace collects and displays high-resolution traces of what a process is doing

OCaml 5,181 115 Updated Dec 12, 2025

3D engine with modern graphics

C 6,718 705 Updated Dec 19, 2025

Open Machine Learning Compiler Framework

Python 12,937 3,739 Updated Dec 18, 2025