    Repositories list

    • SoTA open-source TTS
      Python
      2k15k17844 Updated Sep 25, 2025
    • 01730 Updated Sep 4, 2025
    • resemble.ai API SDK
      TypeScript
      41331 Updated Aug 12, 2025
    • Perth

      Public
      Open Audio Watermarking Tool
      Python
      3437361 Updated Jun 26, 2025
    • xformers

      Public
      Hackable and optimized Transformers building blocks, supporting a composable construction.
      Python
      733000 Updated Jun 23, 2025
    • monotonic_align

      Public
      Monotonic Alignment Search
      Cython
      169700 Updated Jun 9, 2025
    • flowhigh

      Public
      [ICASSP 2025] "FLowHigh: Towards efficient and high-quality audio super-resolution with single-step flow matching"
      Python
      121100 Updated May 12, 2025
    • espeak-ng

      Public
      eSpeak NG is an open source speech synthesizer that supports more than a hundred languages and accents.
      C
      1.1k400 Updated Mar 31, 2025
    • agents

      Public
      Build real-time multimodal AI applications 🤖🎙️📹
      Python
      1.6k500 Updated Mar 20, 2025
    • agents-js

      Public
      Build realtime multimodal AI agents with Node.js
      TypeScript
      180300 Updated Mar 18, 2025
    • fairseq

      Public
      Facebook AI Research Sequence-to-Sequence Toolkit written in Python.
      Python
      6.6k000 Updated Feb 27, 2025
    • A Python package for calculating the PESQ.
      Python
      75000 Updated Feb 22, 2025
    • PyTSMod

      Public
      An open-source Python library for audio time-scale modification.
      Python
      29600 Updated Feb 13, 2025
    • peft

      Public
      🤗 PEFT: State-of-the-art Parameter-Efficient Fine-Tuning.
      Python
      2.1k100 Updated Jan 2, 2025
    • AI powered speech denoising and enhancement
      Python
      2462k543 Updated Dec 3, 2024
    • mup

      Public
      maximal update parametrization (µP)
      Jupyter Notebook
      104001 Updated Sep 5, 2024
    • Python
      0510 Updated Sep 5, 2024
    • Python
      1610 Updated May 8, 2024
    • aiortc

      Public
      WebRTC and ORTC implementation for Python using asyncio
      Python
      851000 Updated Mar 27, 2024
    • aioice

      Public
      asyncio-based Interactive Connectivity Establishment (RFC 5245)
      Python
      64000 Updated Feb 15, 2024
    • TypeScript
      41600 Updated Dec 16, 2023
    • Go
      1200 Updated Nov 13, 2023
    • Run OpenAI Whisper as a Cog model
      Python
      50200 Updated Nov 8, 2023
    • Efficient, scalable and enterprise-grade CPU/GPU inference server for 🤗 Hugging Face transformer models 🚀
      Python
      153000 Updated Oct 25, 2023
    • A Python package to analyze and compare voices with deep learning
      Python
      4683.1k422 Updated Oct 12, 2023
    • A Heroku buildpack for ffmpeg that always downloads the latest static build
      Shell
      721000 Updated Aug 21, 2023
    • g2pW

      Public
      Chinese Mandarin Grapheme-to-Phoneme Converter. 中文轉注音或拼音 (INTERSPEECH 2022)
      Python
      46000 Updated Jul 8, 2023
    • univnet

      Public
      Unofficial PyTorch Implementation of UnivNet Vocoder (https://arxiv.org/abs/2106.07889)
      Python
      45000 Updated May 19, 2023
    • NeMo

      Public
      NeMo: a toolkit for conversational AI
      Python
      3.2k900 Updated Jan 18, 2023
    • Simple text to phonemes converter for multiple languages
      Python
      1932001 Updated Nov 21, 2022