Stars
📑 PageIndex: Document Index for Reasoning-based RAG
A Unified Toolkit for Deep Learning-Based Document Image Analysis
Enterprise-grade and API-first LLM workspace for unstructured documents, including data extraction, redaction, rights management, prompt playground, and more!
An open-source RAG-based tool for chatting with your documents.
A system for agentic LLM-powered data processing and ETL
Open Source Semantic Layer & Knowledge Engineering Framework
A simple screen-parsing tool for building pure vision-based GUI agents
A practical, hands-on guide to building a small language model from scratch. Learn transformer architecture, attention mechanisms, and training techniques through step-by-step implementation with P…
Get your documents ready for gen AI
The simplest, fastest repository for training/finetuning small-sized VLMs.
PyTorch Implementation of Rasa's DIET Classifier.
Source code to reproduce results of our paper "DIET: Lightweight Language Understanding for Dialogue Systems"
An Open Source Toolkit For LLM Distillation
Tools for merging pretrained large language models.
Everything about the SmolLM and SmolVLM family of models
Developer-friendly OSS embedded retrieval library for multimodal AI. Search More; Manage Less.
Multilingual Automatic Speech Recognition with word-level timestamps and confidence
LLM agents built for control. Designed for real-world use. Deployed in minutes.
Slick, declarative command-line video editing & API
Pure TypeScript media toolkit for reading, writing, and converting video and audio files, directly in the browser.
🎥 Make videos programmatically with React
Your Creative Copilot for Video Editing
SD.Next: All-in-one WebUI for AI generative image and video creation
OCR, layout analysis, reading order, table recognition in 90+ languages
The code used to train and run inference with the ColVision models, e.g. ColPali, ColQwen2, and ColSmol.