
Highlights

  • Pro

Organizations

@LCAV

Soprano: Instant, Ultra-Realistic Text-to-Speech

Python 759 77 Updated Jan 12, 2026

Pyroomacoustics is a package for audio signal processing for indoor applications. It was developed as a fast prototyping platform for beamforming algorithms in indoor scenarios.

Python 1,758 476 Updated Dec 15, 2025

SoTA open-source TTS

Python 21,347 2,784 Updated Dec 15, 2025

A scalable generative AI framework built for researchers and developers working on Large Language Models, Multimodal, and Speech AI (Automatic Speech Recognition and Text-to-Speech)

Python 16,537 3,282 Updated Jan 13, 2026

The Hugging Face Course on Transformers for Audio

MDX 472 150 Updated Dec 18, 2025

Omnilingual ASR: Open-Source Multilingual Speech Recognition for 1600+ Languages

Python 2,579 219 Updated Dec 30, 2025
Python 16 1 Updated Nov 19, 2025

SoulX-Podcast is an inference codebase by the Soul AI team for generating high-fidelity podcasts from text.

Python 3,017 381 Updated Dec 11, 2025
Jupyter Notebook 176 11 Updated Nov 3, 2025

🤗 Diffusers: State-of-the-art diffusion models for image, video, and audio generation in PyTorch.

Python 32,429 6,683 Updated Jan 13, 2026

On-device TTS model by Neuphonic

Python 4,340 462 Updated Dec 22, 2025

Official repo for MMAU-Pro Benchmark

Python 13 Updated Sep 25, 2025

Verbatim Automatic Speech Recognition with improved word-level timestamps and filler detection

Python 884 48 Updated Jun 3, 2025

Whisper realtime streaming for long speech-to-text transcription and translation

Python 3,516 411 Updated Nov 12, 2025

A fast multimodal LLM for real-time voice

Python 4,313 355 Updated Dec 12, 2025

WhisperX: Automatic Speech Recognition with Word-level Timestamps (& Diarization)

Python 19,600 2,106 Updated Oct 21, 2025

Open-Source Frontier Voice AI

Python 20,227 2,230 Updated Dec 17, 2025

Fast and memory-efficient exact attention

Python 21,582 2,278 Updated Jan 13, 2026

Multi-lingual large voice generation model, providing full-stack inference, training, and deployment capabilities.

Python 18,936 2,110 Updated Jan 12, 2026

Automatic Speech Recognition for Low-Resourced Middle Eastern Languages - Interspeech 2025

Python 3 Updated Jul 9, 2025

A package for NeuCodec: a 50 Hz, 0.8 kbps, 24 kHz audio codec.

Python 138 18 Updated Oct 7, 2025

Towards Human-Sounding Speech

Python 5,872 506 Updated Dec 5, 2025

A high-throughput and memory-efficient inference and serving engine for LLMs

Python 67,403 12,557 Updated Jan 13, 2026

State-of-the-art TTS model under 25MB 😻

Python 9,450 491 Updated Aug 23, 2025

Zonos-v0.1 is a leading open-weight text-to-speech model trained on more than 200k hours of varied multilingual speech, delivering expressiveness and quality on par with—or even surpassing—top TTS …

Python 7,151 812 Updated Mar 5, 2025

🐸💬 - a deep learning toolkit for Text-to-Speech, battle-tested in research and production

Python 44,211 5,908 Updated Aug 16, 2024