Starred repositories
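Neural Network and Deep Learning (《神经网络与深度学习》), by Xipeng Qiu
Repository for the Xmart Youth Forum, hosting video recordings and lecture notes from past student forums and frontier lectures. For the latest Xmart announcements, follow the WeChat official account "XLANCE Lab".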
A Fundamental End-to-End Speech Recognition Toolkit and Open Source SOTA Pretrained Models, Supporting Speech Recognition, Voice Activity Detection, Text Post-processing etc.
Multilingual large voice generation model, providing full-stack inference, training, and deployment capabilities.
🐸💬 - a deep learning toolkit for Text-to-Speech, battle-tested in research and production
A Repository for Single- and Multi-modal Speaker Verification, Speaker Recognition and Speaker Diarization
Keyword spotting on Arm Cortex-M Microcontrollers
Awesome speech/audio LLMs, representation learning, and codec models
Paper, Code and Resources for Speech Language Model and End2End Speech Dialogue System.
This is a speech analysis, modification and synthesis system
ASLP-lab / DiffRhythm2
Forked from xiaomi-research/diffrhythm2
Di♪♪Rhythm 2: Efficient And High Fidelity Song Generation Via Block Flow Matching
A library for speech data augmentation in time-domain
Open-Source Chrome extension for AI-powered web automation. Run multi-agent workflows using your own LLM API key. Alternative to OpenAI Operator.
Examples of my Claude Code infrastructure with skill auto-activation, hooks, and agents
12 Lessons to Get Started Building AI Agents
AI Audio Datasets (AI-ADS) 🎵, including Speech, Music, and Sound Effects, which can provide training data for Generative AI, AIGC, AI model training, intelligent audio tool development, and audio a…
Join the community on Discord for more discussions around Neutone! https://discord.gg/VHSMzb8Wqp
Deezer source separation library including pretrained models.
kyutai-labs / nanoGPTaudio
Forked from karpathy/nanoGPT
Code for the blog "Neural audio codecs: how to get audio into LLMs"
Whisper-Flamingo [Interspeech 2024] and mWhisper-Flamingo [IEEE SPL 2025] for Audio-Visual Speech Recognition and Translation
Chinese translation of Hands-On-Large-Language-Models (hands-on-llms): learn large language models hands-on
Implementation of Monaural Speech Enhancement with Recursive Learning in the Time Domain
We Speech Toolkit, LLM based Speech Toolkit for Speech Understanding, Generation, and Interaction
Implement a ChatGPT-like LLM in PyTorch from scratch, step by step
Chinese translation of the LLMs-from-scratch project