- The Institute of Computing Technology of the Chinese Academy of Sciences
- Beijing, P.R. China
- http://blog.amalcao.me
Stars
Distributed Compiler based on Triton for Parallel Systems
Tensor Compute Primitives: Mid-level Intermediate Representation for Machine Learning Programs
Domain-specific language designed to streamline the development of high-performance GPU/CPU/accelerator kernels
A modern model graph visualizer and debugger
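💯 2026 Software Designer (intermediate-level Ruankao exam) preparation resource library, plus a companion free practice-question app. https://ruankaodaren.com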
KernelBench: Can LLMs Write GPU Kernels? - Benchmark + Toolkit with Torch -> CUDA (+ more DSLs)
Shared Middle-Layer for Triton Compilation
C/C++ frontend for MLIR. Also features polyhedral optimizations, parallel optimizations, and more!
This is the top-level repository for the Accel-Sim framework.
A library to benchmark CUDA code, similar to google benchmark.
Collection of benchmarks to measure basic GPU capabilities
A GPU benchmark suite for assessing on-chip GPU memory bandwidth
Universal LLM Deployment Engine with ML Compilation
BlackHole is a modern macOS audio loopback driver that allows applications to pass audio to other applications with zero additional latency.
PennyLane is a cross-platform Python library for quantum computing, quantum machine learning, and quantum chemistry. Built by researchers, for research.
Heron: Automatically Constrained High-Performance Library Generation for Deep Learning Accelerators
This is a list of awesome edge AI inference-related papers.
Must-read research papers and links to tools and datasets that are related to using machine learning for compilers and systems optimisation
Polyhedral Parallel Code Generation (source repository: http://repo.or.cz/ppcg.git)
Transformer-related optimization, including BERT and GPT