Sebastian Sepulveda seb-sep

inference @ppl-ai

15 followers · 11 following

Perplexity AI
San Francisco
in/sebastianmsepulveda

Starred repositories

bertmaher / simplegemm

Cuda 123 16 Updated Oct 22, 2025

ashvardanian / less_slow.cpp

Playing around "Less Slow" coding practices in C++ 20, C, CUDA, PTX, & Assembly, from numerics & SIMD to coroutines, ranges, exception handling, networking and user-space IO

C++ 1,871 75 Updated Sep 10, 2025

HazyResearch / ThunderKittens

Tile primitives for speedy kernels

Cuda 2,877 194 Updated Nov 9, 2025

dropbox / gemlite

Fast low-bit matmul kernels in Triton

Python 393 29 Updated Oct 26, 2025

JakeTrock / gosblk

lsblk in go for apple computers

Go 10 Updated Nov 3, 2024

MattPD / cpplinks

A categorized list of C++ resources.

5,137 524 Updated Nov 10, 2025

facebookresearch / lingua

Meta Lingua: a lean, efficient, and easy-to-hack codebase to research LLMs.

Python 4,732 272 Updated Jul 18, 2025

vosen / ZLUDA

CUDA on non-NVIDIA GPUs

Rust 13,411 849 Updated Nov 10, 2025

KhronosGroup / SPIRV-Cross

SPIRV-Cross is a practical tool and library for performing reflection on SPIR-V and disassembling SPIR-V back to high level languages.

GLSL 2,311 616 Updated Nov 7, 2025

zml / zml

Any model. Any hardware. Zero compromise. Built with @ziglang / @openxla / MLIR / @bazelbuild

Zig 2,859 112 Updated Nov 10, 2025

aiola-lab / whisper-medusa

Whisper with Medusa heads

Python 863 53 Updated Aug 6, 2025

exaloop / codon

A high-performance, zero-overhead, extensible Python compiler with built-in NumPy support

Python 16,014 558 Updated Nov 10, 2025

seatedro / glyph

convert images, video to ascii!

Zig 489 24 Updated Sep 2, 2025

philipturner / metal-flash-attention

FlashAttention (Metal Port)

Swift 549 34 Updated Sep 22, 2024

hollance / neural-engine

Everything we actually know about the Apple Neural Engine (ANE)

2,309 85 Updated Oct 21, 2025

philipturner / metal-benchmarks

Apple GPU microarchitecture

Metal 559 27 Updated Sep 22, 2024

karpathy / LLM101n

LLM101n: Let's build a Storyteller

35,504 1,933 Updated Aug 1, 2024

linkedin / Liger-Kernel

Efficient Triton Kernels for LLM Training

Python 5,818 428 Updated Nov 8, 2025

regrettable-username / llm.metal

Forked from karpathy/llm.c

LLM training in simple, raw C/Metal Shading Language

Cuda 60 4 Updated Apr 24, 2024

fla-org / flash-linear-attention

🚀 Efficient implementations of state-of-the-art linear attention models

Python 3,806 298 Updated Nov 9, 2025

facebookresearch / xformers

Hackable and optimized Transformers building blocks, supporting a composable construction.

Python 10,087 733 Updated Oct 31, 2025

OatmealDome / dolphin-ios

Dolphin for iOS, reborn

C++ 413 57 Updated Oct 23, 2025

sm64-port / sm64-port

Forked from n64decomp/sm64

A port of https://www.github.com/n64decomp/sm64 for modern devices.

C 1,135 176 Updated Nov 15, 2024

IBM / onnx-mlir-serving

ONNX Serving is a project written with C++ to serve onnx-mlir compiled models with GRPC and other protocols. Benefiting from C++ implementation, ONNX Serving has very low latency overhead and high t…

C++ 25 4 Updated Sep 17, 2025