Stars
OSUM & OSUM-EChat: an open speech understanding model and an empathetic spoken chatbot built on it, open-sourced by ASLP@NPU.
Use PEFT or Full-parameter to CPT/SFT/DPO/GRPO 500+ LLMs (Qwen3, Qwen3-MoE, Llama4, GLM4.5, InternLM3, DeepSeek-R1, ...) and 200+ MLLMs (Qwen3-VL, Qwen3-Omni, InternVL3.5, Ovis2.5, Llava, GLM4v, Ph…
Open Source framework for voice and multimodal conversational AI
A fast inference library for running LLMs locally on modern consumer-class GPUs
Free and Open Source, Distributed, RESTful Search Engine
Maix Speech AI lib, a fast and small speech lib running on embedded devices, including ASR, chat, TTS etc.
A 10000+ hours dataset for Chinese speech recognition
LvHang / aps
Forked from funcwj/aps
A workspace for single/multi-channel speech recognition & enhancement & separation.
Espresso: A Fast End-to-End Neural Speech Recognition Toolkit
A lightweight speech processing toolkit based on PyTorch and (Py)Kaldi
Production First and Production Ready End-to-End Speech Recognition Toolkit
⚡ TensorFlowASR: Almost State-of-the-art Automatic Speech Recognition in TensorFlow 2. Supports languages that can use characters or subwords
FSA/FST algorithms, differentiable, with PyTorch compatibility.
kaldi-asr/kaldi is the official location of the Kaldi project.