📑 PageIndex: Document Index for Reasoning-based RAG

Python 5,311 417 Updated Jan 8, 2026

A Unified Toolkit for Deep Learning Based Document Image Analysis

Python 5,633 518 Updated Aug 15, 2024

an ambient intelligence library

Python 6,051 393 Updated Jan 9, 2026

Software that makes labeling PDFs easy.

Python 425 79 Updated May 13, 2024
Java 26 12 Updated Mar 28, 2025

Enterprise-grade and API-first LLM workspace for unstructured documents, including data extraction, redaction, rights management, prompt playground, and more!

Python 1,127 115 Updated Jan 12, 2026

An open-source RAG-based tool for chatting with your documents.

Python 24,840 2,054 Updated Jul 4, 2025

A system for agentic LLM-powered data processing and ETL

Python 3,397 366 Updated Dec 30, 2025

Open Source Semantic Layer & Knowledge Engineering Framework

Python 289 40 Updated Jan 12, 2026

A simple screen parsing tool towards pure vision based GUI agent

Jupyter Notebook 24,193 2,080 Updated Sep 12, 2025

A practical, hands-on guide to building a small language model from scratch. Learn transformer architecture, attention mechanisms, and training techniques through step-by-step implementation with P…

Python 13 5 Updated Dec 10, 2025

Get your documents ready for gen AI

Python 49,880 3,456 Updated Jan 13, 2026

The simplest, fastest repository for training/finetuning small-sized VLMs.

Python 4,520 443 Updated Oct 27, 2025

PyTorch Implementation of Rasa's DIET Classifier.

Jupyter Notebook 17 1 Updated Dec 1, 2022

Source code to reproduce results of our paper "DIET: Lightweight Language Understanding for Dialogue Systems"

Python 64 14 Updated May 12, 2020

An Open Source Toolkit For LLM Distillation

Python 821 112 Updated Dec 21, 2025

Tools for merging pretrained large language models.

Python 6,677 655 Updated Jan 2, 2026

Everything about the SmolLM and SmolVLM family of models

Python 3,552 260 Updated Nov 20, 2025

Developer-friendly OSS embedded retrieval library for multimodal AI. Search More; Manage Less.

Rust 8,468 699 Updated Jan 12, 2026

Multilingual Automatic Speech Recognition with word-level timestamps and confidence

Python 2,723 204 Updated Sep 9, 2025

Synthetic data generation for tabular data

Python 3,382 412 Updated Jan 12, 2026

LLM agents built for control. Designed for real-world use. Deployed in minutes.

Python 17,485 1,475 Updated Jan 12, 2026

Slick, declarative command line video editing & API

TypeScript 5,272 353 Updated May 12, 2025

Pure TypeScript media toolkit for reading, writing, and converting video and audio files, directly in the browser.

TypeScript 4,981 182 Updated Jan 12, 2026

🎥 Make videos programmatically with React

TypeScript 25,252 1,435 Updated Jan 12, 2026

Your Creative Copilot for Video Editing

TypeScript 972 104 Updated Jan 9, 2026

SD.Next: All-in-one WebUI for AI generative image and video creation

Python 6,876 538 Updated Jan 12, 2026

OCR, layout analysis, reading order, table recognition in 90+ languages

Python 19,096 1,309 Updated Oct 21, 2025

Zep | Examples, Integrations, & More

Python 3,958 568 Updated Jan 8, 2026

The code used to train and run inference with the ColVision models, e.g. ColPali, ColQwen2, and ColSmol.

Python 2,442 223 Updated Jan 8, 2026