Stars
A two-step optimization for sound source separation on the adaptive front-end domain
Speech enhancement, speech separation, sound source localization
Multi-lingual large voice generation model, providing full-stack inference, training, and deployment capabilities.
Debug, evaluate, and monitor your LLM applications, RAG systems, and agentic workflows with comprehensive tracing, automated evaluations, and production-ready dashboards.
cogmhear / avse_challenge
Forked from claritychallenge/clarity
COG-MHEAR Audio-Visual Speech Enhancement Challenge
verl: Volcano Engine Reinforcement Learning for LLMs
DeepSpeed is a deep learning optimization library that makes distributed training and inference easy, efficient, and effective.
A high-throughput and memory-efficient inference and serving engine for LLMs
An Easy-to-use, Scalable and High-performance RLHF Framework based on Ray (PPO & GRPO & REINFORCE++ & TIS & vLLM & Ray & Dynamic Sampling & Async Agentic RL)
A flexible and efficient codebase for training visually-conditioned language models (VLMs)
SGLang is a fast serving framework for large language models and vision language models.
Sequential Convex Programming Toolbox for nonconvex trajectory optimization.
Streamlit App Starter Kit helps kick start your Streamlit app creation.
This chatbot app is built using the Llama 2 open source LLM from Meta.
Template for creating applications that use the Delta Game Engine.
GregorR / rnnoise-nu
Forked from xiph/rnnoise
Recurrent neural network for audio noise reduction, slightly improved for general use
Seamless operability between C++11 and Python
Code and models for the paper "One Transformer Fits All Distributions in Multi-Modal Diffusion"
Official Code for Stable Cascade
Pretrain, finetune ANY AI model of ANY size on 1 or 10,000+ GPUs with zero code changes.
Code for SuDoRm-Rf networks for efficient audio source separation. SuDoRm-Rf stands for SUccessive DOwnsampling and Resampling of Multi-Resolution Features which enables a more efficient way of sep…
DLRover: An Automatic Distributed Deep Learning System
This GitHub repo is for the NeurIPS 2021 and Interspeech 2022 papers on non-matching-reference-based speech quality assessment.
Matlab code for Short-Time Fourier Transform Uncertainty Propagation (STFT-UP) (PhD thesis, 2010)
Speech Denoising WaveNet Architecture Implementation in PyTorch
Source data, scripts and makefiles of the experiment for the Speex codec quality evaluation