Hugging Face
- France
- https://ebezzam.github.io/
- @ericbezzam
- in/eric-bezzam
Highlights
- Pro
Stars
Soprano: Instant, Ultra-Realistic Text-to-Speech
Pyroomacoustics is a package for audio signal processing for indoor applications. It was developed as a fast prototyping platform for beamforming algorithms in indoor scenarios.
A scalable generative AI framework built for researchers and developers working on Large Language Models, Multimodal, and Speech AI (Automatic Speech Recognition and Text-to-Speech)
The Hugging Face Course on Transformers for Audio
Omnilingual ASR: Open-Source Multilingual Speech Recognition for 1600+ Languages
SoulX-Podcast is an inference codebase by the Soul AI team for generating high-fidelity podcasts from text.
🤗 Diffusers: State-of-the-art diffusion models for image, video, and audio generation in PyTorch.
Verbatim Automatic Speech Recognition with improved word-level timestamps and filler detection
Whisper realtime streaming for long speech-to-text transcription and translation
WhisperX: Automatic Speech Recognition with Word-level Timestamps (& Diarization)
Fast and memory-efficient exact attention
Multilingual large voice generation model with full-stack support for inference, training, and deployment.
Automatic Speech Recognition for Low-Resourced Middle Eastern Languages - Interspeech 2025
A package for NeuCodec: a 50 Hz, 0.8 kbps, 24 kHz audio codec.
A high-throughput and memory-efficient inference and serving engine for LLMs
Zonos-v0.1 is a leading open-weight text-to-speech model trained on more than 200k hours of varied multilingual speech, delivering expressiveness and quality on par with—or even surpassing—top TTS …
🐸💬 - a deep learning toolkit for Text-to-Speech, battle-tested in research and production