Indicina
- Remote (Lagos)
- https://linkedin.com/in/ugochukwu-onyebuchi
- @VinciSon
Stars
Instant voice cloning by MIT and MyShell. Audio foundation model.
Turn any PDF or image document into structured data for your AI. A powerful, lightweight OCR toolkit that bridges the gap between images/PDFs and LLMs. Supports 100+ languages.
Faster Whisper transcription with CTranslate2
WhisperX: Automatic Speech Recognition with Word-level Timestamps (& Diarization)
Clone a voice in 5 seconds to generate arbitrary speech in real-time
Multilingual Document Layout Parsing in a Single Vision-Language Model
FULL Augment Code, Claude Code, Cluely, CodeBuddy, Comet, Cursor, Devin AI, Junie, Kiro, Leap.new, Lovable, Manus Agent Tools, NotionAI, Orchids.app, Perplexity, Poke, Qoder, Replit, Same.dev, Trae…
Tesseract Open Source OCR Engine (main repository)
SkyReels-V2: Infinite-length Film Generative model
Speech To Speech: an effort for an open-sourced and modular GPT4-o
LLaMA-Omni is a low-latency and high-quality end-to-end speech interaction model built upon Llama-3.1-8B-Instruct, aiming to achieve speech capabilities at the GPT-4o level.
Document intelligence framework for Python - Extract text, metadata, and structured data from PDFs, images, Office documents, and more. Built on Pandoc, PDFium, and Tesseract.
A library that provides an embedded python distribution to be usable from inside golang
Build custom inference engines for models, agents, multi-modal systems, RAG, pipelines and more.
API service for docling document conversion
Running Docling as an API service
Get your documents ready for gen AI
scripts for working with open.bible data
Your one-stop solution for voice dataset creation
ChatShell is a productivity tool for the command-line, powered by OpenAI's GPT-3 language model. It helps users find shell commands quickly and easily, reducing the need to search online and improv…
idiap / coqui-ai-TTS
Forked from coqui-ai/TTS. 🐸💬 - a deep learning toolkit for Text-to-Speech, battle-tested in research and production
Free MLOps course from DataTalks.Club
OCR, layout analysis, reading order, table recognition in 90+ languages
Rembg is a tool to remove image backgrounds
Source code for the X Recommendation Algorithm
AfroLID, a powerful neural toolkit for African language identification that covers 517 African languages.