-
neutts-air Public
Forked from neuphonic/neutts-air
On-device TTS model by Neuphonic
Python Apache License 2.0 Updated Oct 29, 2025 -
chatterbox-vllm Public
Forked from randombk/chatterbox-vllm
VLLM Port of the Chatterbox TTS model
Python MIT License Updated Oct 18, 2025 -
sesame-finetune Public
Forked from knottwill/sesame-finetune
Finetune Sesame AI's conversational speech model on new languages and voices. Blog post: https://blog.speechmatics.com/sesame-finetune
Python MIT License Updated Sep 27, 2025 -
distributed-llama Public
Forked from b4rtaz/distributed-llama
Distributed LLM inference. Connect home devices into a powerful cluster to accelerate LLM inference. More devices means faster inference.
C++ MIT License Updated Sep 6, 2025 -
Step-Audio2 Public
Forked from stepfun-ai/Step-Audio2
Step-Audio 2 is an end-to-end multi-modal large language model designed for industry-strength audio understanding and speech conversation.
Python Apache License 2.0 Updated Sep 1, 2025 -
phonemizer Public
Forked from zwhitchcox/phonemizer
Simple text to phones converter for multiple languages
Python GNU General Public License v3.0 Updated Aug 20, 2025 -
dia-finetuning Public
Forked from stlohrey/dia-finetuning
A TTS model capable of generating ultra-realistic dialogue in one pass.
Python Apache License 2.0 Updated Jul 25, 2025 -
piper1-gpl Public
Forked from OHF-Voice/piper1-gpl
Fast and local neural text-to-speech engine
C++ GNU General Public License v3.0 Updated Jul 15, 2025 -
mlx-vlm Public
Forked from Blaizzy/mlx-vlm
MLX-VLM is a package for inference and fine-tuning of Vision Language Models (VLMs) on your Mac using MLX.
Python MIT License Updated Jul 15, 2025 -
python3-sipsimple Public
Forked from AGProjects/python3-sipsimple
SIP SIMPLE SDK written in Python
Python Other Updated Jul 14, 2025 -
chatterbox-streaming Public
Forked from davidbrowne17/chatterbox-streaming
Streaming and Fine-tuning for Chatterbox TTS
Python MIT License Updated Jun 15, 2025 -
piper-recording-studio Public
Forked from rhasspy/piper-recording-studio
Local voice recording for creating Piper datasets
JavaScript MIT License Updated Jun 9, 2025 -
styletts2-inference Public
Forked from patriotyk/styletts2-inference
Onnx compatible styletts2 code
Python MIT License Updated Jun 8, 2025 -
WavLMMSDD Public
Forked from bunyaminergen/WavLMMSDD
This repository combines `WavLM`, a powerful speech representation model from Microsoft, with `MSDD` (Multi-Scale Diarization Decoder), a state-of-the-art approach for speaker diarization from Nvi…
Jupyter Notebook GNU General Public License v3.0 Updated Mar 10, 2025 -
reverb Public
Forked from revdotcom/reverb
Open source inference code for Rev's model
Python Apache License 2.0 Updated Nov 14, 2024 -
autovc Public
Forked from auspicious3000/autovc
AutoVC: Zero-Shot Voice Style Transfer with Only Autoencoder Loss
Python MIT License Updated Oct 23, 2024 -
alpaca-lora Public
Forked from tloen/alpaca-lora
Instruct-tune LLaMA on consumer hardware
Jupyter Notebook Apache License 2.0 Updated Mar 25, 2024 -
llama.cpp Public
Forked from ggml-org/llama.cpp
LLM inference in C/C++
C++ MIT License Updated Mar 8, 2024 -
panns_inference Public
Forked from qiuqiangkong/panns_inference
Python MIT License Updated Mar 5, 2024 -
helix Public
Forked from helixml/helix
Create your own AI by fine-tuning open source models
Go Other Updated Feb 8, 2024 -
DeepPhonemizer Public
Grapheme to phoneme conversion with deep learning.
Python MIT License Updated Dec 8, 2023 -
whisperX Public
Forked from m-bain/whisperX
WhisperX: Automatic Speech Recognition with Word-level Timestamps (& Diarization)
Python BSD 4-Clause "Original" or "Old" License Updated Nov 14, 2023 -
pyannote-audio Public
Forked from pyannote/pyannote-audio
Neural building blocks for speaker diarization: speech activity detection, speaker change detection, overlapped speech detection, speaker embedding
Jupyter Notebook MIT License Updated Nov 14, 2023 -
whisper-cpp-python Public
Forked from carloscdias/whisper-cpp-python
whisper.cpp bindings for python
Python MIT License Updated Aug 24, 2023 -
doctr Public
Forked from mindee/doctr
docTR (Document Text Recognition) - a seamless, high-performing & accessible library for OCR-related tasks powered by Deep Learning.
Python Apache License 2.0 Updated Mar 17, 2023 -
PaddleOCR Public
Forked from PaddlePaddle/PaddleOCR
Awesome multilingual OCR toolkits based on PaddlePaddle (practical ultra lightweight OCR system, support 80+ languages recognition, provide data annotation and synthesis tools, support training and…
Python Apache License 2.0 Updated Jan 31, 2022 -
txtai Public
Forked from neuml/txtai
💡 Build AI-powered semantic search applications
Python Apache License 2.0 Updated Jan 23, 2022 -
JD2Skills-BERT-XMLC Public
Forked from WING-NUS/JD2Skills-BERT-XMLC
Code and Dataset for the Bhola et al. (2020) Retrieving Skills from Job Descriptions: A Language Model Based Extreme Multi-label Classification Framework
Python MIT License Updated Sep 20, 2021