Stars
Making large AI models cheaper, faster and more accessible
A curated list of engineering blogs
Distributed Machine Learning Patterns from Manning Publications by Yuan Tang https://bit.ly/2RKv8Zo
A miniature library for struct-field reflection in C++
Production-grade client-side tracing, profiling, and analysis for complex software systems.
Provides very lightweight outcome<T> and result<T> (non-Boost edition)
A C++20 library for sequence-orientated programming
Tensors and Dynamic neural networks in Python with strong GPU acceleration
Various materials about Profile Guided Optimization and other similar stuff like AutoFDO, Bolt, etc.
mimalloc is a compact general purpose allocator with excellent performance.
A machine learning compiler for GPUs, CPUs, and ML accelerators
Playing around with "Less Slow" coding practices in C++20, C, CUDA, PTX, & Assembly, from numerics & SIMD to coroutines, ranges, exception handling, networking and user-space IO
Presentations, meetups and talks about ClickHouse
C++ implementation of a fast hash map and hash set using hopscotch hashing
Quadsort is a branchless stable adaptive mergesort faster than quicksort.
A branchless unstable quicksort / mergesort that is highly adaptive.
ClickHouse® is a real-time analytics database management system
Static reflection for enums (to string, from string, iteration) for modern C++; works with any enum type without any macro or boilerplate code
Sample code for my CUDA programming book
CUDA by Example, written by two senior members of the CUDA software platform team, shows programmers how to employ this new technology. The authors introduce each area of CUDA development through w…
An easy-to-use framework for large scale recommendation algorithms.