Easy-Turn Public
Forked from ASLP-lab/Easy-Turn
Open-Source Turn-Taking Detection Model and Dataset for Full-Duplex Spoken Dialogue Systems
Python Apache License 2.0 Updated Oct 12, 2025
train-higgs-audio-jimmyMa99 Public
Forked from JimmyMa99/train-higgs-audio
Text-audio foundation model from Boson AI
Python Updated Sep 4, 2025
WenetSpeech-Yue Public
Forked from ASLP-lab/WenetSpeech-Yue
A Large-scale Cantonese Speech Corpus with Multi-dimensional Annotation
Python Apache License 2.0 Updated Sep 4, 2025
mair-hub Public
Forked from nvidia-china-sae/mair-hub
Jupyter Notebook Apache License 2.0 Updated Aug 29, 2025
CarelessWhisper-Streaming Public
Forked from tomer9080/CarelessWhisper-Streaming
Causal streaming adaptation of OpenAI Whisper for real-time transcription on small audio chunks.
Python Other Updated Aug 21, 2025
wavesurfer Public
Forked from pengzhendong/wavesurfer
For audio visualization and playback in Jupyter notebooks.
Python BSD 2-Clause "Simplified" License Updated Aug 14, 2025
FluidAudio Public
Forked from FluidInference/FluidAudio
Fully Native Swift and CoreML. Efficient Speaker Diarization, VAD, and Speech-to-Text for realtime workloads
Swift Apache License 2.0 Updated Aug 14, 2025
Cosyvoice_DPO_NOTES Public
Forked from ScottishFold007/Cosyvoice_DPO_NOTES
CosyVoice_DPO_NOTES: Supercharge Your Cosyvoice model with Cutting-Edge DPO Fine-Tuning!
Python Updated Aug 8, 2025
happy-llm Public
Forked from datawhalechina/happy-llm
📚 A from-scratch tutorial on the principles and practice of large language models
Jupyter Notebook Other Updated Jul 19, 2025
fireredasr-streaming Public
Forked from xphh/fireredasr-streaming
low-latency realtime ASR based on FireRedASR
Python MIT License Updated Jul 8, 2025
CosyVoice Public
Forked from FunAudioLLM/CosyVoice
Multi-lingual large voice generation model, providing inference, training and deployment full-stack ability.
Python Apache License 2.0 Updated Jun 30, 2025
finetune-index-tts Public
Forked from yrom/finetune-index-tts
IndexTTS Fine-tuning notebooks
Jupyter Notebook MIT License Updated Jun 17, 2025
GenVC Public
Forked from caizexin/GenVC
Self-supervised Generative LM-based Voice Conversion
Python MIT License Updated Apr 16, 2025
async_cosyvoice Public
Forked from qi-hua/async_cosyvoice
Accelerating CosyVoice2 inference with vLLM
Jupyter Notebook Apache License 2.0 Updated Apr 13, 2025
audioseal Public
Forked from facebookresearch/audioseal
Localized watermarking for AI-generated speech audios, with SOTA on robustness and very fast detector
Python MIT License Updated Mar 27, 2025
CFPRF Public
Forked from ItzJuny/CFPRF
[ACM MM'24] Coarse-to-Fine Proposal Refinement Framework for Audio Temporal Forgery Detection and Localization
Python MIT License Updated Dec 20, 2024
WavChat Public
Forked from jishengpeng/WavChat
A Survey of Spoken Dialogue Models (60 pages)
Updated Nov 12, 2024
minimind Public
Forked from jingyaogong/minimind
"Large model": train a small 26M-parameter GPT entirely from scratch in 3 hours; inference and training run on a personal GPU!
Python Apache License 2.0 Updated Nov 10, 2024
litgpt Public
Forked from Lightning-AI/litgpt
20+ high-performance LLMs with recipes to pretrain, finetune and deploy at scale.
Python Apache License 2.0 Updated Nov 1, 2024
scoreq Public
Forked from alessandroragano/scoreq
SCOREQ: Speech COntrastive REgression for Quality Assessment (NeurIPS 2024)
Python Updated Oct 18, 2024
F5-TTS Public
Forked from SWivid/F5-TTS
Official code for "F5-TTS: A Fairytaler that Fakes Fluent and Faithful Speech with Flow Matching"
Python MIT License Updated Oct 10, 2024
GTSinger Public
Forked from AaronZ345/GTSinger
Dataset and code of GTSinger (NeurIPS 2024 Spotlight): A Global Multi-Technique Singing Corpus with Realistic Music Scores for All Singing Tasks
Python Other Updated Oct 10, 2024
mamba-diarization Public
Forked from nttcslab-sp/mamba-diarization
Official repository for Mamba-based Segmentation Model for Speaker Diarization
Python Other Updated Oct 10, 2024
reverb Public
Forked from revdotcom/reverb
Open source inference code for Rev's model
Python Other Updated Oct 7, 2024
SLAM-LLM Public
Forked from X-LANCE/SLAM-LLM
Speech, Language, Audio, Music Processing with Large Language Model
Python MIT License Updated Oct 5, 2024
SSR-Speech Public
Forked from WangHelin1997/SSR-Speech
SSR-Speech: Towards Stable, Safe and Robust Zero-shot Speech Editing and Synthesis
Python MIT License Updated Sep 22, 2024
TTS-arxiv-daily Public
Forked from liutaocode/TTS-arxiv-daily
Automatically Update Text-to-speech (TTS) Papers Daily using Github Actions (Update Every 12th hours)
Python Apache License 2.0 Updated Sep 22, 2024