- University of Michigan
- https://joydddd.github.io/
- https://orcid.org/0000-0002-8855-9962
Highlights
- Pro
cutlass Public
Forked from NVIDIA/cutlass
CUDA Templates and Python DSLs for High-Performance Linear Algebra
C++ Other Updated Jan 6, 2026
cute-viz Public
Forked from NTT123/cute-viz
Cute layout visualization
Python MIT License Updated Nov 18, 2025
kraken-1 Public
Forked from meta-pytorch/kraken
Triton-based Symmetric Memory operators and examples
Python Other Updated Sep 26, 2025
helion Public
Forked from pytorch/helion
A Python-embedded DSL that makes it easy to write fast, scalable ML kernels with minimal boilerplate.
Python BSD 3-Clause "New" or "Revised" License Updated Aug 9, 2025
ThunderKittens Public
Forked from HazyResearch/ThunderKittens
Tile primitives for speedy kernels
Cuda MIT License Updated Apr 24, 2025
fast.cu Public
Forked from pranjalssh/fast.cu
Fastest kernels written from scratch
Cuda MIT License Updated Feb 18, 2025
pytorch Public
Forked from pytorch/pytorch
Tensors and Dynamic neural networks in Python with strong GPU acceleration
attention-gym Public
Forked from meta-pytorch/attention-gym
Helpful tools and examples for working with flex-attention
Python BSD 3-Clause "New" or "Revised" License Updated Dec 12, 2024
mm2-gb Public
Forked from Minimap2onGPU/mm2-gb
A versatile pairwise aligner for genomic and spliced nucleotide sequences
C Other Updated Oct 6, 2024
bdus Public
Forked from albertofaria/bdus
A framework for implementing Block Devices in User Space
C GNU General Public License v2.0 Updated Sep 19, 2024
gpt-fast Public
Forked from meta-pytorch/gpt-fast
Simple and efficient pytorch-native transformer text generation in <1000 LOC of python.
Python BSD 3-Clause "New" or "Revised" License Updated Aug 14, 2024
hyrise Public
Forked from hyrise/hyrise
Hyrise is a research in-memory database.
Triton-Puzzles Public
Forked from srush/Triton-Puzzles
Puzzles for learning Triton
Jupyter Notebook Apache License 2.0 Updated May 30, 2024
DRAMsim3 Public
Forked from umd-memsys/DRAMsim3
DRAMsim3: a Cycle-accurate, Thermal-Capable DRAM Simulator
memtier_benchmark Public
Forked from RedisLabs/memtier_benchmark
NoSQL Redis and Memcache traffic generation and benchmarking tool.
onnxruntime Public
Forked from microsoft/onnxruntime
ONNX Runtime: cross-platform, high performance ML inferencing and training accelerator
C++ MIT License Updated Jul 6, 2023
looppoint Public
Forked from nus-comparch/looppoint
Sampled simulation of multi-threaded applications using LoopPoint methodology
Python Updated Jul 6, 2023