Ahead of Time (AOT) Triton Math Library

Python 86 35 Updated Dec 12, 2025

Fast and memory-efficient exact attention

Python 207 69 Updated Jan 9, 2026

Composable Kernel: Performance Portable Programming Model for Machine Learning Tensor Operators

C++ 505 261 Updated Jan 11, 2026

Additional utils and helpers to extend TensorFlow when building recommendation systems, contributed and maintained by SIG Recommenders.

Cuda 630 142 Updated Sep 4, 2025

DeepRec is a high-performance recommendation deep learning framework based on TensorFlow. It is hosted in incubation in LF AI & Data Foundation.

C++ 1,156 361 Updated Jan 21, 2025

CUDA Templates and Python DSLs for High-Performance Linear Algebra

C++ 9,088 1,618 Updated Jan 9, 2026

BladeDISC is an end-to-end DynamIc Shape Compiler project for machine learning workloads.

C++ 912 169 Updated Dec 30, 2024

Yet another SIP003 plugin based on IETF-QUIC

Rust 136 20 Updated Sep 6, 2025

An Open Source Machine Learning Framework for Everyone

C++ 1 1 Updated Aug 4, 2021

A primitive library for neural network

C++ 1,369 223 Updated Nov 24, 2024

An industrial deep learning framework for high-dimension sparse data

PureBasic 4,306 1,028 Updated Sep 25, 2024

Keep writing

HTML 1 Updated Oct 7, 2018

NVTabular is a feature engineering and preprocessing library for tabular data designed to quickly and easily manipulate terabyte scale datasets used to train deep learning based recommender systems.

Python 1,135 148 Updated Oct 23, 2025

benchmark models for TNN, ncnn, MNN

Shell 20 7 Updated Jun 10, 2020

A benchmark framework for TensorFlow

Python 1,145 633 Updated Oct 6, 2023

TNN: developed by Tencent Youtu Lab and Guangying Lab, a uniform deep learning inference framework for mobile, desktop and server. TNN is distinguished by several outstanding features, including its…

C++ 4,608 772 Updated May 9, 2025

HugeCTR is a high efficiency GPU framework designed for Click-Through-Rate (CTR) estimating training

C++ 1,040 205 Updated Sep 15, 2025

Distributed training framework for TensorFlow, Keras, PyTorch, and Apache MXNet.

Python 14,647 2,256 Updated Dec 1, 2025

A list of ICs and IPs for AI, Machine Learning and Deep Learning.

PHP 1,701 278 Updated Jun 5, 2024

5-Segment Pipeline MIPS CPU

Verilog 6 1 Updated Mar 31, 2017

A tool which profiles OpenCL devices to find their peak capacities

C++ 478 125 Updated Dec 3, 2025

row-major matmul optimization

C++ 698 94 Updated Aug 20, 2025

Low-latency machine code generation

C++ 4,401 563 Updated Jan 3, 2026

Tencent Youtu's high-precision dual-branch face detector

Python 2,966 726 Updated Nov 13, 2025

ncnn is a high-performance neural network inference framework optimized for the mobile platform

C++ 22,594 4,374 Updated Jan 11, 2026

Tensors and Dynamic neural networks in Python with strong GPU acceleration

Python 96,518 26,483 Updated Jan 11, 2026

Caffe: a fast open framework for deep learning.

C++ 667 260 Updated Apr 3, 2023

Caffe: a fast open framework for deep learning.

C++ 34,799 18,575 Updated Jul 31, 2024

Convolutional neural networks C++ framework with CPU and GPU (CUDA) backends

C++ 182 44 Updated Dec 7, 2018