Stars
The absolute trainer to light up AI agents.
NISQA - Non-Intrusive Speech Quality and TTS Naturalness Assessment
GUICourse: From General Vision Language Models to Versatile GUI Agents
Run Windows apps such as Microsoft Office/Adobe in Linux (Ubuntu/Fedora) and GNOME/KDE as if they were a part of the native OS, including Nautilus integration. Hard fork of https://github.com/Fmst…
[CVPR 2024] FairCLIP: Harnessing Fairness in Vision-Language Learning
Reference PyTorch implementation and models for DINOv3
Official code for the paper "FairerCLIP: Debiasing CLIP’s Zero-Shot Predictions using Functions in RKHSs".
OCRmyPDF adds an OCR text layer to scanned PDF files, allowing them to be searched
Muon is an optimizer for hidden layers in neural networks
Kimi K2 is the large language model series developed by the Moonshot AI team
Mirage Persistent Kernel: Compiling LLMs into a MegaKernel
Fast implementation of BERT inference directly on NVIDIA (CUDA, CUBLAS) and Intel MKL
Textbook on reinforcement learning from human feedback
An MCP server to run AppleScript and JXA (JavaScript for Automation) on macOS.
A native macOS app that allows users to chat with a local LLM that can respond with information from files, folders and websites on your Mac without installing any other software. Powered by llama.…
Zonos-v0.1 is a leading open-weight text-to-speech model trained on more than 200k hours of varied multilingual speech, delivering expressiveness and quality on par with—or even surpassing—top TTS …
This is the homepage of a new book entitled "Mathematical Foundations of Reinforcement Learning."
Voice gender classifier using ECAPA-TDNN
Bringing BERT into modernity via both architecture changes and scaling
Python package for reading Adobe Photoshop PSD files
[WIP] Resources for AI engineers. Also contains supporting materials for the book AI Engineering (Chip Huyen, 2025)
A high-level, ergonomic Rust library for creating PDF documents.
Open-source high-performance RISC-V processor