View umarbutler's full-sized avatar


A Python library for interacting with AWS SageMaker AI deployments of the Isaacus API.

Python 1 Updated Oct 28, 2025

The code used to evaluate embedding models on the Massive Legal Embedding Benchmark (MLEB).

Python 25 3 Updated Nov 7, 2025
Python 1,236 117 Updated Oct 9, 2025

Official repo for the paper "Scaling Synthetic Data Creation with 1,000,000,000 Personas"

Python 1,402 114 Updated Feb 19, 2025

Unofficial faiss wheel builder for NVIDIA GPU

Python 26 5 Updated Nov 23, 2025

🚀 Efficient implementations of state-of-the-art linear attention models

Python 3,906 310 Updated Nov 25, 2025

Low-bit optimizers for PyTorch

Python 132 9 Updated Oct 9, 2023

PyTorch native quantization and sparsity for training and inference

Python 2,530 375 Updated Nov 25, 2025

Efficient Triton Kernels for LLM Training

Python 5,873 438 Updated Nov 23, 2025
Python 553 54 Updated Sep 23, 2025

This repository provides a clear, educational implementation of Byte Pair Encoding (BPE) tokenization in plain Python. The focus is on algorithmic understanding, not raw performance.

Python 11 1 Updated Aug 28, 2024

Fast bare-bones BPE for modern tokenizer training

Python 171 6 Updated Jun 23, 2025

(1) Courses on key machine learning and software engineering topics; and (2) a workflow for creating similar courses with LLMs in Cursor.

Python 12 Updated Jun 9, 2025

Python bindings for the PCRE2 library created by Philip Hazel

Python 15 5 Updated Nov 25, 2025

KL3M training data collection and preprocessing

Python 19 1 Updated Apr 14, 2025

Terminal bandwidth utilization tool

Rust 11,321 331 Updated Nov 3, 2025

A lightweight, low-dependency, unified API to use all common reranking and cross-encoder models.

Python 1,575 98 Updated May 28, 2025

Scripts for evaluating extractive question answering models on the LegalQAEval legal question answering benchmark.

Python 3 2 Updated May 19, 2025

A server-side JavaScript/TypeScript library for interacting with the Isaacus API, the world's first legal AI API.

TypeScript 2 2 Updated Nov 13, 2025

The OpenAPI specification for the Isaacus API

1 1 Updated Oct 26, 2025

A Python library for interacting with the Isaacus API, the world's first legal AI API.

Python 9 3 Updated Nov 22, 2025

The highest quality collection of up-to-date OpenAPI specifications for public APIs on the internet. This dataset also includes descriptions, categories, uptime metrics, and media assets for every …

61 9 Updated Nov 24, 2024

A GitHub action to build Stainless SDKs.

TypeScript 31 14 Updated Nov 17, 2025

Franken UI is an HTML-first UI component library built on UIkit 3 and extended with LitElement, inspired by shadcn/ui.

TypeScript 2,429 40 Updated Oct 21, 2025

A simple zero-config tool to make locally trusted development certificates with any names you'd like.

Go 57,236 3,008 Updated Aug 13, 2024

The most advanced frontend drag & drop page builder. Create high-end, pixel perfect websites at record speeds. Any theme, any page, any design.

PHP 6,771 1,484 Updated Nov 25, 2025

Get your documents ready for gen AI

Python 44,934 3,195 Updated Nov 25, 2025

A fast and lightweight Rust library for splitting text into semantically meaningful chunks.

Rust 3 Updated Jan 12, 2025

The No-Hassle CMS for Static Sites Generators

TypeScript 3,194 397 Updated Oct 9, 2025

Rust regex + PyO3

Rust 1 2 Updated Nov 10, 2025