Stars
Code for SuDoRm-Rf networks for efficient audio source separation. SuDoRm-Rf stands for SUccessive DOwnsampling and Resampling of Multi-Resolution Features which enables a more efficient way of sep…
Score-based Generative Models (Diffusion Models) for Speech Enhancement and Dereverberation
A concise but complete full-attention transformer with a set of promising experimental features from various papers
A tutorial for Speech Enhancement researchers and practitioners. The purpose of this repo is to organize the world’s resources for speech enhancement and make them universally accessible and useful.
Self-Supervised Noise Embeddings (Self-SNE)
Training General-Purpose Audio Tagging Networks with Noisy Labels and Iterative Self-Verification
A PyTorch Implementation of End-to-End Models for Speech-to-Text
Fast and flexible AutoML with learning guarantees.
OpenL3: Open-source deep audio and image embeddings
Learn an L3 embedding from audio/video pairs
pytorch-kaldi is a project for developing state-of-the-art DNN/RNN hybrid speech recognition systems. The DNN part is managed by pytorch, while feature extraction, label computation, and decoding a…
Code for the paper "Language Models are Unsupervised Multitask Learners"
Face recognition with deep neural networks.
Home surveillance system with facial recognition
kaldi-asr/kaldi is the official location of the Kaldi project.
Real-time full-duplex speech recognition server, based on the Kaldi toolkit and the GStreamer framework.
In this repository, I will share some useful notes and references about deploying deep learning-based models in production.
A platform for the collaborative creation of open audio collections labeled by humans and based on Freesound content.
A Cooperative Voice Analysis Repository for Speech Technologies
Interactive Data Visualization in the browser, from Python
A Python implementation of Deep Belief Networks built upon NumPy and TensorFlow with scikit-learn compatibility
Speech Enhancement Generative Adversarial Network in TensorFlow
Starter code for working with the YouTube-8M dataset.