Stars
Exploring the scalable matrix extension of the Apple M4 processor
Diffusion model (SD, Flux, Wan, Qwen Image, Z-Image, ...) inference in pure C/C++
IREE's PyTorch Frontend, based on Torch Dynamo.
llama3 implementation, one matrix multiplication at a time
Collaborative Collection of C++ Best Practices. This online resource is part of Jason Turner's collection of C++ Best Practices resources. See README.md for more information.
Working draft of the proposed RISC-V V vector extension
Numerical software development homework for 21SP