Record Audio from the User's Microphone in Apps that are Deployed to the Web. (via Browser Media-API, REACT-based, Streamlit Custom Component)

TypeScript 494 98 Updated Sep 11, 2023

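Fully commented in Chinese. The loss function of RetinaNet, implemented in PyTorch. You can use it in one-stage detection or classification tasks to reduce the effect of data imbalance. Intended for one-stage object detection algorithms, where it improves detection performance. You can also use this loss function in classification tasks…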

Jupyter Notebook 491 112 Updated Oct 9, 2025
HTML 46 18 Updated Oct 10, 2025

An attempt to use neural networks to generate charts for the rhythm game Malody.

Jupyter Notebook 51 13 Updated Feb 19, 2020

library to read/write .npy and .npz files in C/C++

C++ 1,440 326 Updated Jan 18, 2023

A pure python module for reading and writing kaldi ark files

Python 267 37 Updated Mar 6, 2025

An Open Source Machine Learning Framework for Everyone

C++ 192,591 75,002 Updated Nov 28, 2025

Run TensorFlow models in C++ without installation and without Bazel

C++ 809 181 Updated Aug 16, 2024

"Recurrent Models of Visual Attention" in TensorFlow

Python 41 9 Updated Apr 13, 2017

ACLEW Diarization Virtual Machine

Shell 34 9 Updated Jul 29, 2019

Deep neural network based speech enhancement toolkit

MATLAB 217 62 Updated Jun 14, 2019

Voice activity detection (VAD) toolkit including DNN, bDNN, LSTM and ACAM based VAD. We also provide our directly recorded dataset.

MATLAB 866 234 Updated Jun 9, 2021

A tool/script for batch speech data enhancement with speed/volume/RIRS/MUSAN

Shell 24 5 Updated Jun 28, 2020

Evaluation of the classification performance (Speech, Music, and Noise) of 1D (WaveNet) and 2D (MobileNet) CNN and RNN (GRU) on the MUSAN corpus.

Python 14 10 Updated Sep 23, 2020

Robust Speech Activity Detection (SAD) in movie audio

Python 26 10 Updated Jan 27, 2021

The codebase for Data-driven general-purpose voice activity detection.

Python 94 23 Updated Aug 3, 2023

Repository for our Interspeech2020 general-purpose voice activity detection (GPVAD) paper

Python 141 29 Updated Aug 3, 2023

Diarization scoring tools.

Python 260 46 Updated Mar 28, 2023

Python interface to the WebRTC Voice Activity Detector

C 2,405 424 Updated Jul 4, 2024

Code for reproducing experiments in "Domain-Adversarial Voice Activity Detection"

Jupyter Notebook 23 4 Updated Mar 3, 2020

Silero VAD: pre-trained enterprise-grade Voice Activity Detector

Python 7,466 678 Updated Nov 25, 2025

Reproduction of the paper "On training targets for noise-robust voice activity detection".

Jupyter Notebook 5 2 Updated Jun 17, 2021

Code for the TASLP paper "PSLA: Improving Audio Tagging With Pretraining, Sampling, Labeling, and Aggregation".

Python 149 15 Updated Jul 13, 2023

Code for the Interspeech 2021 paper "AST: Audio Spectrogram Transformer".

Jupyter Notebook 1,386 237 Updated May 21, 2023

🗣️ A book and repo to get you started programming voice computing applications in Python (10 chapters and 200+ scripts).

Python 387 88 Updated Dec 8, 2022

🎵 A repository for manually annotating files to create labeled acoustic datasets for machine learning.

Python 46 3 Updated Feb 20, 2022

📊 Easily apply audio-related machine learning models trained on the AudioSet dataset (527+ models/classes).

Python 31 12 Updated Jun 17, 2024