-
flash-attention-minimal Public
Forked from tspeterkim/flash-attention-minimal
Flash Attention in ~100 lines of CUDA (forward pass only)
Cuda Apache License 2.0 Updated Oct 6, 2025 -
ktransformers Public
Forked from kvcache-ai/ktransformers
A Flexible Framework for Experiencing Cutting-edge LLM Inference Optimizations
Python Apache License 2.0 Updated Apr 18, 2025 -
llm.c Public
Forked from karpathy/llm.c
LLM training in simple, raw C/CUDA
Cuda MIT License Updated Aug 1, 2024 -
Paddle Public
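Forked from PaddlePaddle/Paddle
PArallel Distributed Deep LEarning: Machine Learning Framework from Industrial Practice (the PaddlePaddle core framework: high-performance single-machine and distributed training and cross-platform deployment for deep learning and machine learning)
C++ Apache License 2.0 Updated May 10, 2024 -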
bert_tokenization_for_java Public
Forked from zhongbin1/bert_tokenization_for_java
This is a Java version of the Chinese tokenization described in BERT.
Java Apache License 2.0 Updated Nov 10, 2022 -
bert-for-tf2 Public
Forked from kpe/bert-for-tf2
A Keras TensorFlow 2.0 implementation of BERT, ALBERT and adapter-BERT.
Python MIT License Updated Aug 20, 2022 -
gstreamer-onnxruntime-objectdetection Public
Forked from seanmurr1/gstreamer-onnxruntime-objectdetection
C++ Other Updated Aug 18, 2022 -
models Public
Forked from tensorflow/models
Models and examples built with TensorFlow
Python Other Updated Mar 16, 2022 -
TIM-VX Public
Forked from VeriSilicon/TIM-VX
VeriSilicon Tensor Interface Module
C MIT License Updated Jan 14, 2022 -
tensorflow Public
Forked from tensorflow/tensorflow
An Open Source Machine Learning Framework for Everyone
C++ Apache License 2.0 Updated Mar 4, 2021 -
conv_arithmetic Public
Forked from vdumoulin/conv_arithmetic
A technical report on convolution arithmetic in the context of deep learning
TeX MIT License Updated May 6, 2019 -
algorithms Public
Forked from jeffgerickson/algorithms
Bug-tracking for Jeff's algorithms book, notes, etc.
Updated Jan 19, 2019 -
simplyOCL Public
Preparing the OpenCL environment is tedious, so I wrapped the OpenCL initialization and the setting of kernel parameters.