speechprojects

38 repositories

InspireMusic
Public
InspireMusic: A Unified Framework for Music, Song, Audio Generation.
Python • Apache License 2.0 • 121 • 0 • 0 • 0 • Updated May 9, 2025
PDMX
Public
PDMX: A Large-Scale Public Domain MusicXML Dataset for Symbolic Music Processing
Python • MIT License • 4 • 0 • 0 • 0 • Updated Oct 2, 2024
lycon
Public
Python • 1 • 0 • 0 • 0 • Updated Sep 1, 2024
diarizers
Public
Python • 23 • 0 • 0 • 0 • Updated Jun 14, 2024
speech-trident
Public
Awesome speech/audio LLMs, representation learning, and codec models
71 • 0 • 0 • 0 • Updated Apr 13, 2024
VoiceCraft
Public
Zero-Shot Speech Editing and Text-to-Speech in the Wild
Jupyter Notebook • Other • 797 • 0 • 0 • 0 • Updated Mar 29, 2024
audiocraft
Public
Audiocraft is a library for audio processing and generation with deep learning. It features the state-of-the-art EnCodec audio compressor / tokenizer, along with MusicGen, a simple and controllable music generation LM with textual and melodic conditioning.
Python • MIT License • 2.5k • 0 • 0 • 0 • Updated Jun 25, 2023
whisper.cpp
Public
Port of OpenAI's Whisper model in C/C++
C • MIT License • 4.8k • 0 • 0 • 0 • Updated Feb 18, 2023
audio-diffusion-pytorch
Public
Audio generation using diffusion models, in PyTorch.
Python • MIT License • 177 • 0 • 0 • 0 • Updated Aug 17, 2022
leaf-audio
Public
LEAF is a learnable alternative to audio features such as mel-filterbanks, that can be initialized as an approximation of mel-filterbanks, and then be trained for the task at hand, while using a very small number of parameters.
Python • Apache License 2.0 • 53 • 0 • 0 • 0 • Updated Mar 1, 2022
NATSpeech
Public
A Non-Autoregressive Text-to-Speech (NAR-TTS) framework, including official PyTorch implementation of PortaSpeech (NeurIPS 2021) and DiffSpeech (AAAI 2022)
Python • MIT License • 103 • 0 • 0 • 0 • Updated Feb 17, 2022
AutoEq
Public
Automatic headphone equalization from frequency responses
Jupyter Notebook • MIT License • 2.5k • 0 • 0 • 0 • Updated Dec 2, 2021
soundata
Public
Python library for downloading, loading & working with sound datasets
Python • BSD 3-Clause "New" or "Revised" License • 27 • 0 • 0 • 0 • Updated Nov 24, 2021
TTS
Public
🐸💬 - a deep learning toolkit for Text-to-Speech, battle-tested in research and production
Jupyter Notebook • Mozilla Public License 2.0 • 5.7k • 0 • 0 • 0 • Updated Sep 17, 2021
praudio
Public
Audio preprocessing framework for Deep Learning audio applications
Python • MIT License • 10 • 0 • 0 • 0 • Updated Aug 27, 2021
OpenSpeech
Public
Open-Source Toolkit for End-to-End Speech Recognition leveraging PyTorch-Lightning and Hydra.
Python • MIT License • 115 • 0 • 0 • 0 • Updated Jun 9, 2021
voice2json
Public
Command-line tools for speech and intent recognition on Linux
Python • MIT License • 67 • 0 • 0 • 0 • Updated May 21, 2021
speechbrain.github.io
Public
The SpeechBrain project aims to build a novel speech toolkit fully based on PyTorch. With SpeechBrain users can easily create speech processing systems, ranging from speech recognition (both HMM/DNN and end-to-end), speaker recognition, speech enhancement, speech separation, multi-microphone speech processing, and many others.
HTML • 31 • 0 • 0 • 0 • Updated Mar 18, 2021
flashlight
Public
A C++ standalone library for machine learning
C++ • Other • 501 • 0 • 0 • 0 • Updated Mar 4, 2021
torch-audiomentations
Public
Fast audio data augmentation in PyTorch. Inspired by audiomentations. Useful for deep learning.
Python • MIT License • 96 • 0 • 0 • 0 • Updated Feb 24, 2021
spokestack-python
Public
Spokestack is a library that allows a user to easily incorporate a voice interface into a Python application.
Python • Apache License 2.0 • 14 • 0 • 0 • 0 • Updated Jan 27, 2021
pyttsx3
Public
Offline text-to-speech synthesis for Python
Python • GNU General Public License v3.0 • 353 • 0 • 0 • 0 • Updated Sep 30, 2020
espnet
Public
End-to-End Speech Processing Toolkit
Python • Apache License 2.0 • 2.3k • 0 • 0 • 0 • Updated Jun 5, 2020
DeepMusicClassification
Public
An implementation of a Convolutional Neural Network to Classify Music Genres
Python • MIT License • 9 • 0 • 0 • 0 • Updated Sep 5, 2019
MusicTransformer-tensorflow2.0
Public
Implementation of Music Transformer with TensorFlow 2.0 (ICLR 2019)
Python • MIT License • 78 • 0 • 0 • 0 • Updated Aug 12, 2019
lmms
Public
Cross-platform music production software
C++ • GNU General Public License v2.0 • 1.1k • 0 • 0 • 0 • Updated Apr 17, 2019
snickery
Public
Hybrid speech synthesiser
Python • Apache License 2.0 • 6 • 0 • 0 • 0 • Updated Nov 6, 2018
Deep_Speaker-speaker_recognition_system
Public
Keras implementation of "Deep Speaker: an End-to-End Neural Speaker Embedding System" (speaker recognition)
Python • 80 • 0 • 0 • 0 • Updated Oct 5, 2018
pytheory
Public
Music Theory for Humans.
Python • 80 • 0 • 0 • 0 • Updated Sep 10, 2018
amodem
Public
Audio MODEM Communication Library in Python
Python • Other • 129 • 0 • 0 • 0 • Updated Jun 18, 2018