Starred repositories
#1 Locally hosted web application that allows you to perform various operations on PDF files
Jobs scraper library for LinkedIn, Indeed, Glassdoor, Google, ZipRecruiter & more
Readest is a modern, feature-rich ebook reader designed for avid readers offering seamless cross-platform access, powerful tools, and an intuitive interface to elevate your reading experience.
A cross platform desktop reading app, based on the Readium Desktop toolkit
A free self-hostable speed reader. Highly customizable. Implements chunking (RSVP), pacing and highlighting. Modern UI and local-storage only.
Master programming by recreating your favorite technologies from scratch.
Tool for extracting important terms from a PDF and generating a printable index.
Collection of OCR-related Python tools and wrappers from @OCR-D
bilalix / quranic-corpus
Forked from kaisdukes/quranic-corpus
The Quranic Arabic Corpus, an invaluable linguistic resource, is due for a revamp. We're calling on Linguistics, AI, and Tech volunteers to join us in this exciting journey.
A Chrome/Firefox extension that downloads books from Internet Archive (archive.org) and HathiTrust Digital Library (hathitrust.org)
OCRmyPDF adds an OCR text layer to scanned PDF files, allowing them to be searched
Enhance Tesseract OCR output for scanned PDFs by applying Large Language Model (LLM) corrections.
A post-processing tool for scanned sheets of paper.
Get your documents ready for gen AI
Toolkit for linearizing PDFs for LLM datasets/training
Turn any PDF or image document into structured data for your AI. A powerful, lightweight OCR toolkit that bridges the gap between images/PDFs and LLMs. Supports 100+ languages.
Full text, footnotes, and formatting of the ASV Bible (1901).
Explore machine learning and data science with Codespaces
A GitHub template for writing LaTeX documents collaboratively with automatic rendering using GitHub Actions.
GitHub Action to compile LaTeX documents
Web interface for recognizing text, proofreading OCR, and creating fully-digitized documents.
Web based JavaScript GUI library for proofreading/editing hOCR
A web-based hOCR editor with visual overlay editing and intelligent OCR processing optimized for handwritten text.
Ready-to-use OCR with 80+ supported languages and all popular writing scripts including Latin, Chinese, Arabic, Devanagari, Cyrillic, etc.