SaoYear
🙃
Focusing

Organizations

@Audio-WestlakeU


Official PyTorch implementation of 'Rec-RIR: Monaural Blind Room Impulse Response Identification via DNN-based Reverberant Speech Reconstruction in STFT Domain'

Python 24 2 Updated Nov 5, 2025

Extracted YouTube 8M URLs and Labels without all the TF Record parsing/features

Python 30 7 Updated Nov 19, 2023

Code for ICLR 2025 Paper: Synthio: Augmenting Small-Scale Audio Classification Datasets with Synthetic Data

Python 11 Updated Mar 31, 2025

PyTorch implementation of Conv-TasNet: Surpassing Ideal Time-Frequency Magnitude Masking for Speech Separation

Python 519 79 Updated May 26, 2023

Conformer-based Metric GAN for speech enhancement

Python 395 67 Updated May 3, 2024

Open implementation of UNIVERSE and UNIVERSE++ diffusion-based speech enhancement models.

Python 109 13 Updated Aug 29, 2024

This is the official implementation of LiSenNet.

Python 135 16 Updated Nov 15, 2024

Download AudioSet for Vision-Audio-Text Pre-training

Python 13 2 Updated May 16, 2022

PyTorch implementation of "CleanMel: Mel-Spectrogram Enhancement for Improving Both Speech Quality and ASR".

Python 77 10 Updated Aug 7, 2025

Official PyTorch implementation of 'VINP: Variational Bayesian Inference with Neural Speech Prior for Joint ASR-Effective Speech Dereverberation and Blind RIR Identification' [IEEE TASLP]

Python 25 5 Updated Nov 1, 2025

Masked Modeling Duo: Towards a Universal Audio Pre-training Framework

Jupyter Notebook 124 7 Updated Nov 12, 2025

PyTorch implementation of a cosine annealing learning rate scheduler with linear warmup

2 Updated Jul 25, 2024

A PyTorch-based implementation of the compression and decompression module in "Ultra Dual-Path Compression For Joint Echo Cancellation And Noise Suppression".

Jupyter Notebook 60 7 Updated Feb 20, 2024

StoRM: A Diffusion-based Stochastic Regeneration Model for Speech Enhancement and Dereverberation

Python 245 31 Updated Sep 13, 2024

Real Time Speech Enhancement in the Waveform Domain (Interspeech 2020). We provide a PyTorch implementation of the paper Real Time Speech Enhancement in the Waveform Domain, in which we present a ca…

Python 1,849 310 Updated Mar 14, 2023

Python 19 Updated Mar 6, 2025

MOS score prediction by fine-tuned wav2vec2.0 model

Python 171 22 Updated Oct 20, 2022

Generation scripts for EARS-WHAM and EARS-Reverb

Python 41 6 Updated Jul 4, 2025

UT-Sarulab MOS prediction system using SSL models

Python 283 15 Updated Apr 11, 2024

Vocos: Closing the gap between time-domain and Fourier-based neural vocoders for high-quality audio synthesis

Python 1,018 120 Updated Aug 7, 2024

Python 22 5 Updated Dec 14, 2023

Da - ECHO - RetrievAl - daTasEt

Jupyter Notebook 32 4 Updated Jul 7, 2024

Python library that reads JSON files of any size.

Python 196 26 Updated Feb 16, 2023

This repository aims to collect Transformer-based sound event detection (SED) algorithms.

Jupyter Notebook 80 6 Updated Nov 4, 2025

Convert WeChat Silk files to WAV.

Rust 5 1 Updated Apr 25, 2023

UTokyo-SaruLab MOS Prediction System

Python 266 27 Updated Nov 30, 2025

Mamba SSM architecture

Python 16,567 1,517 Updated Nov 11, 2025

A description of "RealMAN: A Real-Recorded and Annotated Microphone Array Dataset for Dynamic Speech Enhancement and Localization" [NeurIPS 2024]

Python 142 16 Updated Apr 29, 2025

A generative speech model for daily dialogue.

Python 38,230 4,154 Updated Nov 27, 2025

Baseline code for DCASE 2023 task 4 B

Python 14 3 Updated Apr 21, 2023