Stars
A retargetable MLIR-based machine learning compiler and runtime toolkit.
A convolutional neural network accelerator implemented on a Xilinx FPGA (xczu7ev-ffvc1156-2-i); YOLOv8 inference takes 60 ms.
The Torch-MLIR project aims to provide first class support from the PyTorch ecosystem to the MLIR ecosystem.
An MLIR-based compiler framework that bridges DSLs (domain-specific languages) to DSAs (domain-specific architectures).
X-KANeRF [KANeRF-benchmarking]: KAN-based NeRF with various basis functions such as B-splines, Fourier, Gaussians, wavelets, and polynomials.
🔍 Hands-On LLM Application Development, Part 1: A Full-Stack Guide to RAG. Read online: https://datawhalechina.github.io/all-in-rag/
A high-throughput and memory-efficient inference and serving engine for LLMs
LLVM Techniques, Tips, and Best Practices: Clang and Middle-End Libraries, published by Packt
A machine learning accelerator core designed for energy-efficient AI at the edge.
An efficient pure-PyTorch implementation of Kolmogorov-Arnold Network (KAN).
TJ4DRadSet: A 4D Radar Dataset for Autonomous Driving
PointNet++: Deep Hierarchical Feature Learning on Point Sets in a Metric Space
PointNet and PointNet++ implemented in PyTorch (pure Python) on ModelNet, ShapeNet, and S3DIS.
[RAL 2025] SGDet3D: Semantics and Geometry Fusion for 3D Object Detection Using 4D Radar and Camera.
PointNet: Deep Learning on Point Sets for 3D Classification and Segmentation
GNU toolchain for RISC-V, including GCC
Multi-channel AI proxy with intelligent key rotation.
[CVPR 2025 Best Paper Award] VGGT: Visual Geometry Grounded Transformer
This project extends the idea of the innovative architecture of Kolmogorov-Arnold Networks (KAN) to the Convolutional Layers, changing the classic linear transformation of the convolution to learna…
CUDA Templates and Python DSLs for High-Performance Linear Algebra
An all-in-one toolkit for LeagueClient. Gathering power 🚀.
Development repository for the Triton language and compiler
Fast and memory-efficient exact attention