Stars
😝 TensorFlowTTS: Real-Time State-of-the-art Speech Synthesis for TensorFlow 2 (supports English, French, Korean, Chinese, and German, and is easy to adapt to other languages)
Voice Activity Detector (VAD) : low-latency, high-performance and lightweight
Production First and Production Ready End-to-End Speech Recognition Toolkit
MNN is a blazing fast, lightweight deep learning framework, battle-tested by business-critical use cases in Alibaba. Full multimodal LLM Android App:[MNN-LLM-Android](./apps/Android/MnnLlmChat/READ…
A scalable generative AI framework built for researchers and developers working on Large Language Models, Multimodal, and Speech AI (Automatic Speech Recognition and Text-to-Speech)
Multilingual large voice generation model, providing full-stack inference, training, and deployment capabilities.
ncnn is a high-performance neural network inference framework optimized for the mobile platform
A modern GUI client based on Tauri, designed to run on Windows, macOS and Linux for a tailored proxy experience
Dolphin is a multilingual, multitask ASR model jointly trained by DataoceanAI and Tsinghua University.
🐸💬 - a deep learning toolkit for Text-to-Speech, battle-tested in research and production
Visualizer for neural network, deep learning and machine learning models
Mars is a cross-platform network component developed by WeChat.
📄 Awesome OCR toolkits for multiple programming languages, based on ONNXRuntime, OpenVINO, PaddlePaddle and PyTorch.
ONNX Runtime: cross-platform, high performance ML inferencing and training accelerator
All-in-One Development Tool based on PaddlePaddle
Turn any PDF or image document into structured data for your AI. A powerful, lightweight OCR toolkit that bridges the gap between images/PDFs and LLMs. Supports 100+ languages.
Official code implementation of General OCR Theory: Towards OCR-2.0 via a Unified End-to-end Model
aasitnikov / fat-aar-android
Forked from kezong/fat-aar-android
Gradle plugin for merging Android libraries (AAR)
Instant voice cloning by MIT and MyShell. Audio foundation model.
FFmpeg Kit for applications. Supports Android, Flutter, iOS, Linux, macOS, React Native and tvOS. Supersedes MobileFFmpeg, flutter_ffmpeg and react-native-ffmpeg.
Multilingual Voice Understanding Model
Open source real-time translation app for Android that runs locally