Institute of Computing Technology, CAS
LeetCUDA Public
Forked from xlite-dev/LeetCUDA
📚LeetCUDA: Modern CUDA Learn Notes with PyTorch for Beginners🐑, 200+ CUDA Kernels, Tensor Cores, HGEMM, FA-2 MMA.🎉
Cuda · GNU General Public License v3.0 · Updated Nov 24, 2025
tiny-flash-attention Public
Forked from 66RING/tiny-flash-attention
flash attention tutorial written in python, triton, cuda, cutlass
Cuda · MIT License · Updated Nov 22, 2025
flash-attention Public
Forked from Dao-AILab/flash-attention
Fast and memory-efficient exact attention
Python · BSD 3-Clause "New" or "Revised" License · Updated Nov 22, 2025
agentdojo Public
Forked from ethz-spylab/agentdojo
A Dynamic Environment to Evaluate Attacks and Defenses for LLM Agents.
Python · MIT License · Updated Oct 25, 2025
coralnpu Public
Forked from google-coral/coralnpu
A machine learning accelerator core designed for energy-efficient AI at the edge.
Emacs Lisp · Apache License 2.0 · Updated Oct 19, 2025
EE219-AICS-2024 Public
Forked from RealJustinNi/EE219-AICS-2024
Jupyter Notebook · Updated Oct 10, 2025
llama2.c Public
Forked from karpathy/llama2.c
Inference Llama 2 in one file of pure C
C · MIT License · Updated Oct 7, 2025
ai-agent-book-projects Public
Forked from bojieli/ai-agent-book-projects
Python · Updated Sep 12, 2025
SimuMax Public
Forked from MooreThreads/SimuMax
a static analytical model for LLM distributed training
Python · Other · Updated Aug 26, 2025
msprof_analyze Public
Forked from chenzomi12/msprof_analyze
msprof_analyze
Python · Apache License 2.0 · Updated Aug 7, 2025
calipers Public
Forked from microsoft/calipers
Criticality-aware Framework for Modeling Computer Performance
C++ · MIT License · Updated Aug 5, 2025
riscv-isa-sim Public
Forked from riscv-software-src/riscv-isa-sim
Spike, a RISC-V ISA Simulator
C · Other · Updated Jul 24, 2025
ventus-gpgpu-isa-simulator Public
Forked from THU-DSP-LAB/ventus-gpgpu-isa-simulator
Ventus GPGPU ISA Simulator Based on Spike
C · Other · Updated Jul 24, 2025
vortex-HPDC Public
Forked from shaoqian2001/vortex-HPDC
for high performance dcache
Verilog · Apache License 2.0 · Updated Jun 5, 2025
TensorGPGPU Public
Forked from yhinai/TensorGPGPU
RISC-V vector and tensor compute extensions for Vortex GPGPU acceleration for ML workloads. Optimized for transformer models, CNNs, and generative AI with configurable precision (FP32/16/BF16/INT8).
Verilog · Apache License 2.0 · Updated Apr 25, 2025
nvdla-attn-mechanism Public
Hardware acceleration for transformer attention mechanisms on NVIDIA Deep Learning Accelerator (NVDLA), enabling efficient inference of transformer models with 135× better power efficiency than CPU…
libgemmini Public
Forked from ucb-bar/libgemmini
Gemmini extensions for Spike
C++ · Updated Mar 18, 2025
ONNXim Public
Forked from PSAL-POSTECH/ONNXim
ONNXim is a fast cycle-level simulator that can model multi-core NPUs for DNN inference
C++ · MIT License · Updated Mar 16, 2025
vortex-warp-cooperative-tcore Public
Forked from BakrN/vortex