Code for SuDoRm-Rf networks for efficient audio source separation. SuDoRm-Rf stands for SUccessive DOwnsampling and Resampling of Multi-Resolution Features which enables a more efficient way of sep…

Jupyter Notebook 335 34 Updated Jul 6, 2023

Score-based Generative Models (Diffusion Models) for Speech Enhancement and Dereverberation

Python 701 101 Updated Aug 22, 2025

A concise but complete full-attention transformer with a set of promising experimental features from various papers

Python 5,762 499 Updated Jan 6, 2026
Python 53 8 Updated May 15, 2025

A tutorial for Speech Enhancement researchers and practitioners. The purpose of this repo is to organize the world’s resources for speech enhancement and make them universally accessible and useful.

MATLAB 814 153 Updated Dec 1, 2020

Self-Supervised Noise Embeddings (Self-SNE)

Jupyter Notebook 158 12 Updated Apr 3, 2025

Concolic Testing for Deep Neural Networks

Python 119 45 Updated Jul 16, 2021

Training General-Purpose Audio Tagging Networks with Noisy Labels and Iterative Self-Verification

Python 29 10 Updated May 10, 2019

A PyTorch Implementation of End-to-End Models for Speech-to-Text

Python 769 178 Updated Jul 6, 2023

Fast and flexible AutoML with learning guarantees.

Jupyter Notebook 3,459 528 Updated Nov 30, 2023

OpenL3: Open-source deep audio and image embeddings

Jupyter Notebook 571 62 Updated Jun 17, 2023

Learn an L3 embedding from audio/video pairs

Jupyter Notebook 88 20 Updated Apr 24, 2022

A Python wrapper for Kaldi

Python 1,030 248 Updated Nov 30, 2025

pytorch-kaldi is a project for developing state-of-the-art DNN/RNN hybrid speech recognition systems. The DNN part is managed by pytorch, while feature extraction, label computation, and decoding a…

Python 2,392 446 Updated Mar 14, 2022

Code for the paper "Language Models are Unsupervised Multitask Learners"

Python 24,539 5,849 Updated Aug 14, 2024

Face recognition with deep neural networks.

Lua 15,390 3,585 Updated Oct 4, 2024

Home surveillance system with facial recognition

HTML 1,256 388 Updated Nov 22, 2022

kaldi-asr/kaldi is the official location of the Kaldi project.

Shell 15,299 5,365 Updated Sep 22, 2025

Real-time full-duplex speech recognition server, based on the Kaldi toolkit and the GStreamer framework.

Python 1,089 339 Updated Jun 8, 2024

In this repository, I will share some useful notes and references about deploying deep learning-based models in production.

4,380 697 Updated Nov 9, 2024

A platform for the collaborative creation of open audio collections labeled by humans and based on Freesound content.

Python 143 11 Updated Oct 6, 2023

A Cooperative Voice Analysis Repository for Speech Technologies

MATLAB 369 116 Updated Jul 27, 2020

Interactive Data Visualization in the browser, from Python

TypeScript 20,287 4,249 Updated Jan 9, 2026

Declarative visualization library for Python

Python 10,203 833 Updated Jan 7, 2026

A Python implementation of Deep Belief Networks built upon NumPy and TensorFlow with scikit-learn compatibility

Python 496 210 Updated Mar 25, 2023
Jupyter Notebook 454 293 Updated Sep 24, 2017

Praat: Doing Phonetics By Computer

C 1,815 273 Updated Jan 5, 2026

Speech Enhancement Generative Adversarial Network in TensorFlow

Python 854 281 Updated Mar 24, 2023

Starter code for working with the YouTube-8M dataset.

Python 2,371 850 Updated Oct 25, 2021